Electromagnetic transient (EMT) simulation of power electronics on the CPU slows down as the system scales up. The massive parallelism of the graphics processing unit (GPU) is therefore utilized to accelerate the simulation of the multi-terminal DC (MTDC) grid, in which detailed models of the semiconductor switches are adopted to provide comprehensive device-level information. Because the large number of nodes makes the solution of the DC grid inefficient, three levels of circuit partitioning are applied: the natural separation of converter stations by transmission lines, the splitting of the apparatus inside each station, and fine-grained partitioning by coupled voltage-current sources. Components with similar attributes are written as one CUDA C function and computed in parallel by single-instruction, multiple-thread (SIMT) execution. The GPU's potential as a new EMT simulation platform for the analysis of large-scale MTDC grids is demonstrated by a speedup of up to 270 times over the CPU implementation for the Greater CIGRE DC grid, with time-steps of 50 ns and 1 μs for device-level and system-level simulation, respectively. Finally, the accuracy of the GPU simulation is validated against the commercial tools SaberRD and PSCAD/EMTDC.