Parkinson’s disease is a neurodegenerative disorder characterized by a variety of motor symptoms. Particularly, difficulties to start/stop movements have been observed in patients. From a technical/diagnostic point of view, these movement changes can be assessed by modeling the transitions between voiced and unvoiced segments in speech, the movement when the patient starts or stops a new stroke in handwriting, or the movement when the patient starts or stops the walking process. This study proposes a methodology to model such difficulties to start or to stop movements considering information from speech, handwriting, and gait. We used those transitions to train convolutional neural networks to classify patients and healthy subjects. The neurological state of the patients was also evaluated according to different stages of the disease (initial, intermediate, and advanced). In addition, we evaluated the robustness of the proposed approach when considering speech signals in three different languages: Spanish, German, and Czech. According to the results, the fusion of information from the three modalities is highly accurate to classify patients and healthy subjects, and it shows to be suitable to assess the neurological state of the patients in several stages of the disease. We also aimed to interpret the feature maps obtained from the deep learning architectures with respect to the presence or absence of the disease and the neurological state of the patients. As far as we know, this is one of the first works that considers multimodal information to assess Parkinson’s disease following a deep learning approach.