This paper presents the design and implementation of an FPGA core that parallelises the operations needed to compute the non-bonded interactions in a molecular dynamics (MD) simulation, with the aim of accelerating the LAMMPS MD software. Our MD processor core, which comprises four identical pipelines working independently in parallel to evaluate the non-bonded potentials, forces and virials, was implemented on the nodes of an FPGA-based supercomputer named Maxwell. Implementing the core on multiple Maxwell nodes allowed us to produce a special-purpose parallel machine for the hardware acceleration of MD simulations. Timing measurements of the pairwise Lennard-Jones (LJ) and short-range Coulombic (via PPPM) interaction computations in MD simulations of solvated Rhodopsin protein systems with various numbers of atoms show performance gains of up to a factor of 13 over the pure software implementation on two nodes of the Maxwell machine. Furthermore, our MD machine is highly scalable: adding Maxwell nodes yields higher computational power. To our knowledge, this is the first attempt to port an existing production-grade MD software package to an FPGA-based parallel computer.
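
For orientation, the quantities evaluated per atom pair (potential, force and virial for the LJ and short-range Coulombic terms) can be sketched in software as follows. This is only an illustrative reference under stated assumptions, not the FPGA pipeline or the LAMMPS source: it uses a plain 12-6 LJ form with no switching function, an erfc-damped real-space Coulomb term of the kind used alongside Ewald-type methods such as PPPM, reduced units with the Coulomb constant set to 1, a single atom type, no periodic boundaries and no exclusions. The names `pair_terms`, `nonbonded` and `g_ewald` are hypothetical.

```python
import math

def pair_terms(rij, qi, qj, epsilon, sigma, g_ewald, cutoff):
    """Return (e_lj, e_coul, fpair) for one pair.

    rij is the vector r_i - r_j; the force on atom i is fpair * rij.
    Assumes reduced units (Coulomb constant = 1) and a plain cutoff.
    """
    r2 = sum(c * c for c in rij)
    if r2 >= cutoff * cutoff:
        return 0.0, 0.0, 0.0
    r = math.sqrt(r2)

    # 12-6 Lennard-Jones energy and force factor
    sr6 = (sigma * sigma / r2) ** 3
    e_lj = 4.0 * epsilon * (sr6 * sr6 - sr6)
    f_lj = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2

    # Real-space (erfc-damped) Coulomb energy and force factor
    gr = g_ewald * r
    erfc_gr = math.erfc(gr)
    e_coul = qi * qj * erfc_gr / r
    f_coul = qi * qj * (
        erfc_gr + 2.0 / math.sqrt(math.pi) * gr * math.exp(-gr * gr)
    ) / (r2 * r)

    return e_lj, e_coul, f_lj + f_coul

def nonbonded(positions, charges, epsilon, sigma, g_ewald, cutoff):
    """Accumulate energies, per-atom forces and the scalar virial over all pairs."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    e_lj_total = e_coul_total = virial = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = [positions[i][k] - positions[j][k] for k in range(3)]
            e_lj, e_coul, fpair = pair_terms(
                rij, charges[i], charges[j], epsilon, sigma, g_ewald, cutoff
            )
            e_lj_total += e_lj
            e_coul_total += e_coul
            # Scalar pair virial: r_ij . F_ij = fpair * |r_ij|^2
            virial += fpair * sum(c * c for c in rij)
            for k in range(3):
                # Newton's third law: equal and opposite forces
                forces[i][k] += fpair * rij[k]
                forces[j][k] -= fpair * rij[k]
    return e_lj_total, e_coul_total, forces, virial
```

Each iteration of the inner loop is independent of the others apart from the final accumulation, which is the property that allows the pair evaluations to be distributed over several identical pipelines.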