When you study the behavior of an Android application, you would have to determine if a specific Java method is called either by the application code or by a third-party code. The easiest way to tackle this problem is to use Androguard. We will see how it is easy to statically extract the call graph of a given Java method from an Android application.

This article is for educational purpose.

Entire call graph extraction

Androguard comes with an handy script androcg.py meant to dump the entire call graph of the given application into a gml or dot file that can be displayed with Gephi.

androcg.py is really simple to use

usage: androcg.py [-h] [--output OUTPUT] [--show] [--verbose]
                  [--classname CLASSNAME] [--methodname METHODNAME]
                  [--descriptor DESCRIPTOR] [--accessflag ACCESSFLAG]

Create a call graph based on the dataof Analysis and export it into a graph

positional arguments:
  APK                   The APK to analyze

optional arguments:
  -h, --help            show this help message and exit
  --output OUTPUT, -o OUTPUT
                        Filename of the output file, the extension is used to
                        decide which format to use (default callgraph.gml)
  --show, -s            instead of saving the graph, print it with mathplotlib
                        (you might not see anything!
  --verbose, -v         Print more output
  --classname CLASSNAME
                        Regex to filter by classname
  --methodname METHODNAME
                        Regex to filter by methodname
  --descriptor DESCRIPTOR
                        Regex to filter by descriptor
  --accessflag ACCESSFLAG
                        Regex to filter by accessflags
  --no-isolated         Do not store methods which has no xrefs

You can also compute this call graph programmatically

from androguard.misc import AnalyzeAPK

a, d, dx = AnalyzeAPK('path/to/an/application.apk')
call_graph = dx.get_call_graph()

When you have your entire call graph, you can dig into it and manually extract the call graph of a given Java method. That is really painful. But, a tiny bit of graph theory will help us.

Specific call graph extraction

A call graph is basically a directed graph where vertices are Java methods and edges are calls from a method to another. The call graph of a given method is the sub-graph the of the entire call graph only containing the ancestors of the specific method. Ancestors are all the vertices that are reachable by following the edges in the backward direction to one or many roots of the directed call graph.

Androguard allows you to find a specific method in a specific class. To do so, you can use the following line script

from androguard.misc import AnalyzeAPK

a, d, dx = AnalyzeAPK('path/to/an/application.apk')
methods = dx.find_methods(methodname='a method regex', classname='a class regex')

If you are looking for the method getDeviceId declared in the class Landroid/telephony/TelephonyManager, the tiny script looks like that

from androguard.misc import AnalyzeAPK

a, d, dx = AnalyzeAPK('path/to/an/application.apk')
methods = dx.find_methods(methodname='getDeviceId', classname='Landroid/telephony/TelephonyManager')

Finally, you just have to get the ancestors of the vertices contained in methods - methods that match the 2 REGEXes.

from androguard.misc import AnalyzeAPK
import matplotlib.pyplot as plt
import networkx as nx

a, d, dx = AnalyzeAPK('path/to/an/application.apk')
call_graph = dx.get_call_graph()
for m in dx.find_methods(methodname='getDeviceId', classname='Landroid/telephony/TelephonyManager'):
    ancestors = nx.ancestors(cg, m.get_method())
    graph = call_graph.subgraph(ancestors)
    # Drawing
    pos = nx.spring_layout(graph, iterations=500)
    nx.draw_networkx_nodes(graph, pos=pos, node_color='r')
    nx.draw_networkx_edges(graph, pos, arrow=True)
    nx.draw_networkx_labels(graph, pos=pos, labels={x: str(x) for x in graph.nodes}, font_size=8)

Adapt the script to your needs, run it and see the result.

This example shows the call graph of getDeviceId method by obfuscated code.