Originally published at https://blog.contactsunny.com on March 3, 2021.
If you are new to JanusGraph and the Gremlin query language, like I am, you may be confused about the out() and outE(), or in() and inE() methods. If you look at examples of these functions, the difference is not easy to spot. Or is it just me?
Anyway, I got confused and it took me a while to understand there is a difference, and there isn’t. Let me explain.
The Sample Graph
Before we look at the differences, let’s look at a sample graph.
As you can see from the graph above, we have four vertices and three edges. The vertex in the middle with the property "name": "sunny" is the vertex from where we'll start our traversal. The other three vertices are the items that I bought from an e-commerce website: a smartphone, a laptop, and a monitor. The relationship is represented with edges labelled bought. The edges have another property called count, which represents the number of times I have bought each item. So I bought three smartphones, two laptops, and one monitor. This is the data we're going to work with.
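If you want to follow along, here is a minimal sketch of how this sample graph could be built in the Gremlin Console. It assumes an in-memory TinkerGraph rather than a JanusGraph instance. The vertex labels 'person' and 'item', and the 'name' property on the item vertices, are my assumptions for illustration. Only "name": "sunny", the 'bought' label, and the count values come from the graph described above.

```groovy
// Assumption: an in-memory TinkerGraph stands in for JanusGraph here.
graph = TinkerGraph.open()
g = graph.traversal()

// Vertex labels 'person' and 'item' and the item 'name' values are hypothetical.
g.addV('person').property('name', 'sunny').as('s').
  addV('item').property('name', 'smartphone').as('p').
  addV('item').property('name', 'laptop').as('l').
  addV('item').property('name', 'monitor').as('m').
  // Three 'bought' edges from sunny, each with a 'count' property.
  addE('bought').from('s').to('p').property('count', 3).
  addE('bought').from('s').to('l').property('count', 2).
  addE('bought').from('s').to('m').property('count', 1).
  iterate()
```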
Now, we’ll first get a reference to our starting vertex with the following query:
sunny = g.V().has('name', 'sunny').next()
We now have all the data we need to understand the difference between these functions.
out() vs. outE()
We already know that we use the out() function to traverse an edge that is going out of the current vertex. We pass in the label of one or more edges to the function. From our e-commerce example, if I want to get all the items that I have bought, I’ll run the following query:
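Assuming g is a traversal source over the sample graph and sunny is the vertex reference we fetched earlier, that traversal can be sketched like this:

```groovy
// Start at the 'sunny' vertex and follow outgoing 'bought' edges.
// The result is the vertices at the other end of those edges.
g.V(sunny).out('bought')
```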
This would give us all the vertices which have a ‘bought’ relationship with the current vertex. But you’d have also seen the following query for the same use case:
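A sketch of that longer form, under the same assumptions about g and sunny. Note that after outE() the traverser is sitting on an edge, so the step that moves to the vertex the edge points at is inV():

```groovy
// outE('bought') stops on the outgoing 'bought' edges themselves.
// inV() then moves from each edge to the vertex it points to.
g.V(sunny).outE('bought').inV()
```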
So, they are performing the same traversal and returning the same results. I found out that when you’re using the outE().inV() combination, you can simply replace it with out(). It's a shorthand for the long form outE().inV(). But then, why would you use outE() at all?
Suppose you want to filter or limit the traversal based on other properties of the edge. For example, in our sample graph, I want to get only the items that I have bought more than once. We have the count property on each of our bought edges. We can use that to filter our vertices. For this, the query is as follows:
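One way to write this, assuming "more than once" means a count greater than 1 and the same g and sunny as before:

```groovy
// Stop on the 'bought' edges, keep only those with count > 1,
// then move on to the item vertices those edges point to.
g.V(sunny).outE('bought').has('count', gt(1)).inV()
```

With the sample data, only the smartphone (count 3) and laptop (count 2) edges pass the filter.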
As you can see, we can use the has() function on edges as well, to keep only the edges with a particular property value. This ability to filter on edge properties is not available when you use the out() function, because the result of out() is vertices. So if you call has() on that result, you'll be filtering on the vertices and not the edges. I hope I'm not complicating things.
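To make the contrast concrete, here is a small sketch of has() applied after out(). It assumes the item vertices carry a 'name' property and no 'count' property of their own, neither of which is stated for the sample graph.

```groovy
// has() after out() is evaluated against the item vertices.
// Assumption: item vertices have a 'name' property.
g.V(sunny).out('bought').has('name', 'laptop')

// This does NOT look at the edge's count. It checks for a 'count'
// property on the vertices, so it matches nothing if they have none.
g.V(sunny).out('bought').has('count', gt(1))
```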
in() vs. inE()
It’s the same story with the in() and inE() functions as well. If you want to filter edges based on extra properties, you use the inE() function instead of in(), and then outV() to reach the vertex the edge comes from. Stopping on the edges with outE() or inE() also lets you apply other steps to the edges themselves, such as aliasing them using the as() function, counting them with the count() function, etc. You can have a look at the documentation to see the list of all functions available on edges.
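Here is a rough sketch of these ideas on the sample graph. The laptop lookup assumes the item vertices have a 'name' property, which is my assumption and not something the sample graph specifies.

```groovy
// Assumption: item vertices can be found by a 'name' property.
laptop = g.V().has('name', 'laptop').next()

// Who bought the laptop? Short form, straight to the vertices.
g.V(laptop).in('bought')

// Long form: stop on the incoming edges, filter on the edge's count,
// then move to the vertex each edge comes from.
g.V(laptop).inE('bought').has('count', gt(1)).outV()

// Count the 'bought' edges going out of sunny (3 in the sample graph).
g.V(sunny).outE('bought').count()

// Alias the edge and the item, then read the edge's count next to the item's name.
g.V(sunny).outE('bought').as('e').inV().as('item').
  select('e', 'item').by('count').by('name')
```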
I hope this has not confused you more than you already were. I thought this would clear things up for at least a few people who have the same questions I had while getting started with JanusGraph and Gremlin. Let me know if this helped, or didn’t.