PyTorch Geometric vs. Deep Graph Library

March 24, 2022

In this article, we compare two graph neural network (GNN) libraries, Deep Graph Library and PyTorch Geometric, to help you decide which one is best for you and your team.

What is deep learning on graphs?

In general, a graph is a system of nodes connected by edges. The nodes typically have some sort of internal state that can be modified by their relationships to other nodes, as defined by the edges connecting them, and these connections and node states can be defined in a wide variety of ways. Deep learning is the repeated application of non-linear transformations to data (representations or activations internally), typically matrix multiplications or convolutions followed by a non-linear activation function. Combining deep learning and graphs gives us the fast-growing field of graph neural networks (GNNs).

Graphs provide a useful framework for any system readily defined by nodes and relationships, including social networks, molecules, and many other types of real and theoretical systems. Working with data defined as graphs imparts meaningful structural information to the system under study, and is accompanied by a rich toolbox of mathematical and algorithmic tools. What's more, because graphs can be described and manipulated with matrix math, they make a fitting complement to deep learning and benefit greatly from the years of development of fast deep learning libraries, which use mainly the same mathematical primitives.

[Figure: An adjacency matrix represents the edge connections in a graph. Public domain diagram by rivesunder.]

Given the rich framework provided by graphs for many types of problems and the substantial successes of deep neural networks over the past few decades, it's no surprise that GNNs have garnered increasing levels of attention and that the field has generated its own specialized breakthroughs.
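
To make the matrix view of graphs described above concrete, here is a minimal sketch in plain Python (no GNN library involved). It builds a dense adjacency matrix for a small undirected graph and applies one simplified message-passing step, relu(A · H · W). The toy graph, the feature values, and the weight matrix are all made-up assumptions for illustration. Real libraries such as PyTorch Geometric and DGL use sparse tensor operations and usually add self-loops and normalization, which this sketch omits.

```python
# Toy undirected graph with 4 nodes; each edge is a (node, node) pair.
# All values here are illustrative assumptions, not taken from the article.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_nodes = 4


def adjacency_matrix(edges, num_nodes):
    """Dense adjacency matrix: A[i][j] == 1 when nodes i and j share an edge."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1  # undirected graph, so the matrix is symmetric
    return A


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def relu(M):
    """Element-wise non-linearity: negative entries become zero."""
    return [[max(0.0, v) for v in row] for row in M]


def message_passing_step(A, H, W):
    """One simplified GNN layer: sum neighbor features, transform, apply ReLU."""
    return relu(matmul(matmul(A, H), W))


A = adjacency_matrix(edges, num_nodes)
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -1.0]]  # one feature row per node
W = [[1.0, -1.0], [0.5, 1.0]]                          # hypothetical layer weights
H_next = message_passing_step(A, H, W)
```

Multiplying by A sums each node's neighbor features, which is why the same matrix primitives that power ordinary deep learning libraries carry over to graphs.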
Arguably the most exciting accomplishment of deep learning with graphs so far has been the development of AlphaFold and AlphaFold2 by DeepMind, a project that has made major strides in solving the protein structure prediction problem, a long-standing grand challenge of structural biology.

With myriad important applications in drug discovery, social networks, basic biology, and many other areas, a number of open-source libraries have been developed for working with graph neural networks. Many of these are mature enough to use in production or research, so how can you go about choosing which library to use when embarking on a new project?

Various factors can contribute to the choice of GNN library for a given project. Not least among them is compatibility with your team's existing expertise: if you are primarily a PyTorch shop, it would make sense to give special consideration to PyTorch Geometric, although you might also be interested in using the Deep Graph Library with PyTorch as the backend (DGL can also use TensorFlow as a backend). Likewise, if you are more familiar with TensorFlow and Keras, Spektral may make sense. If you want to develop with the up-and-coming JAX ecosystem, then Jraph might be a good fit for your GNN project.

Of course, if your team prefers to work in Julia instead of Python, you would probably want to look at GeometricFlux.jl or GraphNeuralNetworks.jl, both based on the Flux.jl machine learning ecosystem. Like the Julia programming language itself and other tools written in it, GeometricFlux.jl and GraphNeuralNetworks.jl are not as well known and have smaller communities than their more established Python counterparts, but they do offer a few compelling advantages. Among the most desirable advantages of Julia-based tools is execution speed, thanks to Julia's built-in "just-in-time" compilation.
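
As a small illustration of the DGL backend choice mentioned above, the sketch below selects a backend through the `DGLBACKEND` environment variable, which DGL reads when it is first imported. The helper `select_dgl_backend` and the `SUPPORTED_BACKENDS` tuple are hypothetical names introduced here, and the tuple lists only the two backends named in this article. The sketch does not import DGL itself, so it runs without DGL installed.

```python
import os

# Hypothetical helper, not part of DGL: only the two backends named in
# this article are listed here.
SUPPORTED_BACKENDS = ("pytorch", "tensorflow")


def select_dgl_backend(name):
    """Set the DGLBACKEND environment variable; call this before `import dgl`."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {name!r}")
    os.environ["DGLBACKEND"] = name
    return name


chosen = select_dgl_backend("pytorch")
# A later `import dgl` in the same process would pick up this setting,
# assuming DGL and the chosen backend are both installed.
```

The variable has to be set before the first `import dgl`, because the backend is chosen at import time.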

While existing familiarity with a given language and with similar libraries like PyTorch or TensorFlow contributes significantly to development efficiency, in terms of the engineering time required to complete a given task, the other major contributor to efficiency in a machine learning project (and the more obvious and recognized of the two) is computational speed. Developer time tends to be a scarcer resource and a more often underestimated constraint in machine learning projects than the execution speed of the code itself, but imagine how much longer it would have taken for the impacts of deep learning to mature if we didn't have widely available libraries for efficient hardware acceleration on GPUs (and on more esoteric and specialized devices). Yet all the specialized hardware in the world won't do much good without efficient software implementations to match.

In this article, we will benchmark and compare two of the most noteworthy open-source libraries for computing with graph neural networks. For the purposes of this comparison, we'll focus on the Python libraries PyTorch Geometric and Deep Graph Library (DGL). As the name implies, PyTorch Geometric is based on PyTorch (plus a number of PyTorch extensions for working with sparse matrices), while DGL can use either PyTorch or TensorFlow as a backend. DGL was used to develop the SE(3)-Transformer, a translationally and rotationally equivariant model that heavily influenced the protein-structure prediction champion model AlphaFold.
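
Since the comparison rests on measured execution speed, here is a minimal sketch of a timing harness of the kind such a benchmark might use. It is not the harness used for this article: the `benchmark` function, its `warmup` and `repeats` parameters, and the `dummy_epoch` workload are all hypothetical stand-ins. In a real comparison, `dummy_epoch` would be replaced by a training epoch or forward pass from each library.

```python
import statistics
import time


def benchmark(fn, warmup=2, repeats=5):
    """Call fn() repeatedly and return (mean, stdev) of wall-clock seconds.

    Warm-up calls are discarded so one-time costs (caching, lazy
    initialization) do not skew the measured runs.
    """
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)


def dummy_epoch():
    """Stand-in workload; swap in a real training epoch or forward pass."""
    return sum(i * i for i in range(10_000))


mean_s, std_s = benchmark(dummy_epoch)
print(f"mean {mean_s:.6f} s, stdev {std_s:.6f} s over 5 runs")
```

When timing GPU code, keep in mind that kernels launch asynchronously, so the device should be synchronized (for example with `torch.cuda.synchronize()` in PyTorch) before reading the clock.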