Commit 66a43cd

GP-5018: Some updated PyGhidra docs
1 parent 7fbf64e commit 66a43cd

2 files changed

+224
-88
lines changed

Ghidra/Features/PyGhidra/README.md

+10
@@ -1 +1,11 @@
11
# PyGhidra
2+
3+
This module provides the following capabilities:
4+
* The [PyGhidra Python library](src/main/py/README.md) and its dependencies.
5+
* A [Plugin](src/main/java/ghidra/pyghidra/PyGhidraPlugin.java) that provides a CPython interpreter.
6+
* A [ScriptProvider](src/main/java/ghidra/pyghidra/PyGhidraScriptProvider.java) capable of running
7+
GhidraScripts written in native CPython 3.
8+
* An [interactive python script](support/pyghidra_launcher.py) that Ghidra uses to install
9+
and launch PyGhidra. This script handles
10+
[virtual environments](https://docs.python.org/3/tutorial/venv.html) and
11+
[externally managed environments](https://packaging.python.org/en/latest/specifications/externally-managed-environments/).

Ghidra/Features/PyGhidra/src/main/py/README.md

+214-88
@@ -1,22 +1,59 @@
11
# PyGhidra
22

3-
PyGhidra is a Python library that provides direct access to the Ghidra API within a native CPython interpreter using [jpype](https://jpype.readthedocs.io/en/latest). As well, PyGhidra contains some conveniences for setting up analysis on a given sample and running a Ghidra script locally. It also contains a Ghidra plugin to allow the use of CPython from the Ghidra user interface.
3+
The PyGhidra Python library, originally developed by the
4+
[Department of Defense Cyber Crime Center (DC3)](https://www.dc3.mil) under the name "Pyhidra", is a
5+
Python library that provides direct access to the Ghidra API within a native CPython 3 interpreter
6+
using [JPype](https://jpype.readthedocs.io/en/latest). PyGhidra contains some conveniences for
7+
setting up analysis on a given sample and running a Ghidra script locally. It also contains a Ghidra
8+
plugin to allow the use of CPython 3 from the Ghidra GUI.
9+
10+
## Installation and Setup
11+
Ghidra provides an out-of-the-box integration with the PyGhidra Python library which makes
12+
installation and usage fairly straightforward. This enables the Ghidra GUI and headless Ghidra to run
13+
GhidraScripts written in native CPython 3, as well as interact with the Ghidra GUI through a
14+
built-in REPL. To launch Ghidra in PyGhidra-mode, see Ghidra's latest
15+
[Installation Guide](https://github.com/NationalSecurityAgency/ghidra/blob/master/GhidraDocs/InstallationGuide.md#pyghidra-mode).
16+
17+
It is also possible (and encouraged!) to use PyGhidra as a standalone Python library for usage
18+
in reverse engineering workflows where Ghidra may be one of many components involved. The following
19+
instructions in this document focus on this type of usage.
20+
21+
To install the PyGhidra Python library:
22+
1. Download and install
23+
[Ghidra 11.3 or later](https://github.com/NationalSecurityAgency/ghidra/releases) to a desired
24+
location.
25+
2. Set the `GHIDRA_INSTALL_DIR` environment variable to point to the directory where Ghidra is
26+
installed.
27+
3. Install PyGhidra:
28+
* Online: `pip install pyghidra`
29+
* Offline: `python3 -m pip install --no-index -f
30+
<GhidraInstallDir>/Ghidra/Features/PyGhidra/pypkg/dist pyghidra`
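
As a rough sketch of steps 2 and 3 in practice, the snippet below picks the Ghidra install directory (an explicit path first, then `GHIDRA_INSTALL_DIR`) and hands it to PyGhidra. The helper names `resolve_install_dir` and `start_ghidra` are illustrative and not part of PyGhidra; `pyghidra.start()` and its `install_dir` keyword are the ones documented in the API section of this README.

```python
import os
from pathlib import Path


def resolve_install_dir(install_dir=None, environ=os.environ):
    """Hypothetical helper: choose the Ghidra install directory.

    An explicit path wins; otherwise fall back to GHIDRA_INSTALL_DIR.
    """
    value = install_dir or environ.get("GHIDRA_INSTALL_DIR")
    if not value:
        raise RuntimeError("Set GHIDRA_INSTALL_DIR or pass install_dir")
    return Path(value)


def start_ghidra(install_dir=None):
    """Hypothetical wrapper: start PyGhidra against the resolved directory."""
    import pyghidra  # deferred import; requires `pip install pyghidra`

    return pyghidra.start(install_dir=resolve_install_dir(install_dir))
```

Failing early with a clear message when neither source is set is a design choice of this sketch, not documented PyGhidra behavior.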
31+
32+
## API
33+
The current version of PyGhidra inherits an API from the original "Pyhidra" project that provides an
34+
excellent starting point for interacting with a Ghidra installation. __NOTE:__ These functions are
35+
subject to change in the future as more thought and feedback is collected on PyGhidra's role in the
36+
greater Ghidra ecosystem:
37+
38+
### pyghidra.start()
39+
To get a raw connection to Ghidra, use the `start()` function. This will set up a JPype connection and
40+
initialize Ghidra in headless mode, which will allow you to directly import `ghidra` and `java`.
41+
42+
__NOTE:__ No projects or programs get set up in this mode.
443

5-
PyGhidra was initially developed for use with Dragodis and is designed to be installable without requiring Java or Ghidra. This allows other Python projects
6-
have PyGhidra as a dependency and provide optional Ghidra functionality without requiring all users to install Java and Ghidra. It is recommended to recommend that users set the `GHIDRA_INSTALL_DIR` environment variable to simplify locating Ghidra.
7-
8-
9-
## Usage
10-
11-
12-
### Raw Connection
13-
14-
To get a raw connection to Ghidra use the `start()` function.
15-
This will setup a Jpype connection and initialize Ghidra in headless mode,
16-
which will allow you to directly import `ghidra` and `java`.
44+
```python
45+
def start(verbose=False, *, install_dir: Path = None) -> "PyGhidraLauncher":
46+
    """
47+
    Starts the JVM and fully initializes Ghidra in Headless mode.
1748
18-
*NOTE: No projects or programs get setup in this mode.*
49+
    :param verbose: Enable verbose output during JVM startup (Defaults to False)
50+
    :param install_dir: The path to the Ghidra installation directory.
51+
        (Defaults to the GHIDRA_INSTALL_DIR environment variable)
52+
    :return: The PyGhidraLauncher used to start the JVM
53+
    """
54+
```
1955

56+
#### Example:
2057
```python
2158
import pyghidra
2259
pyghidra.start()
@@ -30,77 +67,62 @@ from java.lang import String
3067
# do things
3168
```
3269

33-
### Customizing Java and Ghidra initialization
34-
35-
JVM configuration for the classpath and vmargs may be done through a `PyGhidraLauncher`.
70+
### pyghidra.started()
71+
To check whether PyGhidra has been started, use the `started()` function.
3672

3773
```python
38-
from pyghidra.launcher import HeadlessPyGhidraLauncher
39-
40-
launcher = HeadlessPyGhidraLauncher()
41-
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
42-
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
43-
launcher.start()
74+
def started() -> bool:
75+
    """
76+
    Whether the PyGhidraLauncher has already started.
77+
    """
4478
```
4579

46-
### Registering an Entry Point
47-
48-
The `PyGhidraLauncher` can also be configured through the use of a registered entry point on your own python project.
49-
This is useful for installing your own Ghidra plugin which uses PyGhidra and self-compiles.
50-
51-
First create an [entry_point](https://setuptools.pypa.io/en/latest/userguide/entry_point.html) for `pyghidra.setup`
52-
pointing to a single argument function which accepts the launcher instance.
53-
80+
#### Example:
5481
```python
55-
# setup.py
56-
from setuptools import setup
82+
import pyghidra
5783

58-
setup(
59-
# ...,
60-
entry_points={
61-
'pyghidra.setup': [
62-
'acme_plugin = acme.ghidra_plugin.install:setup',
63-
]
64-
}
65-
)
84+
if pyghidra.started():
85+
    ...
6686
```
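
One common use of `started()` is guarding `start()` so that library code can be called more than once in the same process. The following is a minimal sketch of that pattern; `ensure_started` is a hypothetical helper name, and the optional `pg` parameter exists only so the logic can be exercised with a stand-in object instead of the real `pyghidra` module.

```python
def ensure_started(pg=None, **start_kwargs):
    """Hypothetical helper: start PyGhidra only if it is not running yet.

    Returns True if this call started PyGhidra, False if it was already started.
    """
    if pg is None:
        import pyghidra as pg  # deferred import; requires PyGhidra to be installed

    if pg.started():
        return False
    pg.start(**start_kwargs)  # e.g. verbose=True or install_dir=...
    return True
```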
6787

88+
### pyghidra.open_program()
89+
To have PyGhidra set up a binary file for you, use the `open_program()` function. This will set up a
90+
Ghidra project and import the given binary file as a program for you.
6891

69-
Then we create the target function.
70-
This function will be called every time a user starts a PyGhidra launcher.
71-
In the same fashion, another entry point `pyghidra.pre_launch` may be registered and will be called after Ghidra and all
72-
plugins have been loaded.
92+
Again, this will also allow you to import `ghidra` and `java` to perform more advanced processing.
7393

7494
```python
75-
# acme/ghidra_plugin/install.py
76-
from pathlib import Path
77-
import pyghidra
78-
79-
def setup(launcher):
95+
def open_program(
96+
    binary_path: Union[str, Path],
97+
    project_location: Union[str, Path] = None,
98+
    project_name: str = None,
99+
    analyze=True,
100+
    language: str = None,
101+
    compiler: str = None,
102+
    loader: Union[str, JClass] = None
103+
) -> ContextManager["FlatProgramAPI"]: # type: ignore
80104
    """
81-
    Run by PyGhidra launcher to install our plugin.
105+
    Opens given binary path in Ghidra and returns FlatProgramAPI object.
106+
107+
    :param binary_path: Path to binary file, may be None.
108+
    :param project_location: Location of Ghidra project to open/create.
109+
        (Defaults to same directory as binary file)
110+
    :param project_name: Name of Ghidra project to open/create.
111+
        (Defaults to name of binary file suffixed with "_ghidra")
112+
    :param analyze: Whether to run analysis before returning.
113+
    :param language: The LanguageID to use for the program.
114+
        (Defaults to Ghidra's detected LanguageID)
115+
    :param compiler: The CompilerSpecID to use for the program. Requires a provided language.
116+
        (Defaults to the Language's default compiler)
117+
    :param loader: The `ghidra.app.util.opinion.Loader` class to use when importing the program.
118+
        This may be either a Java class or its path. (Defaults to None)
119+
    :return: A Ghidra FlatProgramAPI object.
120+
    :raises ValueError: If the provided language, compiler or loader is invalid.
121+
    :raises TypeError: If the provided loader does not implement `ghidra.app.util.opinion.Loader`.
82122
    """
83-
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
84-
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
85-
86-
# Install our plugin.
87-
source_path = Path(__file__).parent / "java" / "plugin" # path to uncompiled .java code
88-
details = pyghidra.ExtensionDetails(
89-
name="acme_plugin",
90-
description="My Cool Plugin",
91-
author="acme",
92-
plugin_version="1.2",
93-
)
94-
launcher.install_plugin(source_path, details) # install plugin (if not already)
95123
```
96124

97-
98-
### Analyze a File
99-
100-
To have PyGhidra setup a binary file for you, use the `open_program()` function.
101-
This will setup a Ghidra project and import the given binary file as a program for you.
102-
103-
Again, this will also allow you to import `ghidra` and `java` to perform more advanced processing.
125+
#### Example:
104126

105127
```python
106128
import pyghidra
@@ -113,11 +135,12 @@ with pyghidra.open_program("binary_file.exe") as flat_api:
113135
    # We are also free to import ghidra while in this context to do more advanced things.
114136
    from ghidra.app.decompiler.flatapi import FlatDecompilerAPI
115137
    decomp_api = FlatDecompilerAPI(flat_api)
116-
    # ...
138+
    ...
117139
    decomp_api.dispose()
118140
```
119141

120-
By default, PyGhidra will run analysis for you. If you would like to do this yourself, set `analyze` to `False`.
142+
By default, PyGhidra will run analysis for you. If you would like to do this yourself, set `analyze`
143+
to `False`.
121144

122145
```python
123146
import pyghidra
@@ -130,28 +153,65 @@ with pyghidra.open_program("binary_file.exe", analyze=False) as flat_api:
130153
flat_api.analyzeAll(program)
131154
```
132155

133-
134-
The `open_program()` function can also accept optional arguments to control the project name and location that gets created.
135-
(Helpful for opening up a sample in an already existing project.)
156+
The `open_program()` function can also accept optional arguments to control the project name and
157+
location that gets created (helpful for opening up a sample in an already existing project).
136158

137159
```python
138160
import pyghidra
139161

140-
with pyghidra.open_program("binary_file.exe", project_name="EXAM_231", project_location=r"C:\exams\231") as flat_api:
162+
with pyghidra.open_program("binary_file.exe", project_name="MyProject", project_location=r"C:\projects") as flat_api:
141163
    ...
142164
```
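
The `language` and `compiler` parameters can be combined in the same way. The sketch below is an illustration under stated assumptions: `open_program_kwargs` and `print_program_name` are hypothetical helpers, the local check simply mirrors the documented rule that `compiler` requires a `language`, and the IDs shown in the usage note (`x86:LE:64:default`, `gcc`) are example values that must match what your Ghidra installation supports.

```python
def open_program_kwargs(language=None, compiler=None, analyze=True):
    """Hypothetical helper: build keyword arguments for pyghidra.open_program()."""
    # The docstring above states that a compiler requires a provided language;
    # this local pre-check is an assumption of the sketch, not PyGhidra's own code.
    if compiler is not None and language is None:
        raise ValueError("compiler requires a language")
    kwargs = {"analyze": analyze}
    if language is not None:
        kwargs["language"] = language
    if compiler is not None:
        kwargs["compiler"] = compiler
    return kwargs


def print_program_name(binary_path, **options):
    """Hypothetical helper: open a binary and print the imported program's name."""
    import pyghidra  # deferred import; requires PyGhidra and a Ghidra install

    with pyghidra.open_program(binary_path, **open_program_kwargs(**options)) as flat_api:
        program = flat_api.getCurrentProgram()
        print(program.getName())
```

For example, `print_program_name("binary_file.exe", language="x86:LE:64:default", compiler="gcc", analyze=False)` would import the file with an explicit language and compiler and skip analysis.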
143165

144-
145-
### Run a Script
146-
147-
PyGhidra can also be used to run an existing Ghidra Python script directly in your native python interpreter
148-
using the `run_script()` command.
149-
However, while you can technically run an existing Ghidra script unmodified, you may
150-
run into issues due to differences between Jython 2 and CPython 3.
151-
Therefore, some modification to the script may be needed.
166+
### pyghidra.run_script()
167+
PyGhidra can also be used to run an existing Ghidra Python script directly in your native CPython
168+
interpreter using the `run_script()` function. However, while you can technically run an existing
169+
Ghidra script unmodified, you may run into issues due to differences between Jython 2 and
170+
CPython 3/JPype. Therefore, some modification to the script may be needed.
152171

153172
```python
173+
def run_script(
174+
    binary_path: Optional[Union[str, Path]],
175+
    script_path: Union[str, Path],
176+
    project_location: Union[str, Path] = None,
177+
    project_name: str = None,
178+
    script_args: List[str] = None,
179+
    verbose=False,
180+
    analyze=True,
181+
    lang: str = None,
182+
    compiler: str = None,
183+
    loader: Union[str, JClass] = None,
184+
    *,
185+
    install_dir: Path = None
186+
):
187+
    """
188+
    Runs a given script on a given binary path.
189+
190+
    :param binary_path: Path to binary file, may be None.
191+
    :param script_path: Path to script to run.
192+
    :param project_location: Location of Ghidra project to open/create.
193+
        (Defaults to same directory as binary file if None)
194+
    :param project_name: Name of Ghidra project to open/create.
195+
        (Defaults to name of binary file suffixed with "_ghidra" if None)
196+
    :param script_args: Command line arguments to pass to script.
197+
    :param verbose: Enable verbose output during Ghidra initialization.
198+
    :param analyze: Whether to run analysis, if a binary_path is provided, before running the script.
199+
    :param lang: The LanguageID to use for the program.
200+
        (Defaults to Ghidra's detected LanguageID)
201+
    :param compiler: The CompilerSpecID to use for the program. Requires a provided language.
202+
        (Defaults to the Language's default compiler)
203+
    :param loader: The `ghidra.app.util.opinion.Loader` class to use when importing the program.
204+
        This may be either a Java class or its path. (Defaults to None)
205+
    :param install_dir: The path to the Ghidra installation directory. This parameter is only
206+
        used if Ghidra has not been started yet.
207+
        (Defaults to the GHIDRA_INSTALL_DIR environment variable)
208+
    :raises ValueError: If the provided language, compiler or loader is invalid.
209+
    :raises TypeError: If the provided loader does not implement `ghidra.app.util.opinion.Loader`.
210+
    """
211+
```
154212

213+
#### Example:
214+
```python
155215
import pyghidra
156216

157217
pyghidra.run_script(r"C:\input.exe", r"C:\some_ghidra_script.py")
@@ -163,11 +223,77 @@ This can also be done on the command line using `pyghidra`.
163223
> pyghidra C:\input.exe C:\some_ghidra_script.py <CLI ARGS PASSED TO SCRIPT>
164224
```
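
The command-line arguments shown above correspond to the `script_args` parameter of `run_script()`. As a hedged sketch, the hypothetical helper below runs one script over several binaries and forwards the same arguments each time. The `runner` parameter is an assumption of this example that lets the loop be tried with a stand-in function; by default it resolves to `pyghidra.run_script`.

```python
def run_script_on_samples(samples, script_path, script_args=None, runner=None):
    """Hypothetical helper: run one Ghidra script against several binaries.

    Returns the number of binaries processed.
    """
    if runner is None:
        import pyghidra  # deferred import; requires PyGhidra and a Ghidra install

        runner = pyghidra.run_script

    count = 0
    for sample in samples:
        # Pass a fresh list each time so a script cannot mutate shared arguments.
        runner(sample, script_path, script_args=list(script_args or []))
        count += 1
    return count
```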
165225

166-
### Handling Package Name Conflicts
226+
### pyghidra.launcher.PyGhidraLauncher()
227+
JVM configuration for the classpath and vmargs may be done through a `PyGhidraLauncher`.
228+
229+
```python
230+
class PyGhidraLauncher:
231+
    """
232+
    Base pyghidra launcher
233+
    """
234+
235+
    def add_classpaths(self, *args):
236+
        """
237+
        Add additional entries to the classpath when starting the JVM
238+
        """
239+
        self.class_path += args
240+
241+
    def add_vmargs(self, *args):
242+
        """
243+
        Add additional vmargs for launching the JVM
244+
        """
245+
        self.vm_args += args
246+
247+
    def add_class_files(self, *args):
248+
        """
249+
        Add additional entries to be added to the classpath after Ghidra has been fully loaded.
250+
        This ensures that all of Ghidra is available so classes depending on it can be properly loaded.
251+
        """
252+
        self.class_files += args
253+
254+
    def start(self, **jpype_kwargs):
255+
        """
256+
        Starts JPype connection to Ghidra (if not already started).
257+
        """
258+
```
259+
260+
The following `PyGhidraLauncher`s are available:
261+
262+
```python
263+
class HeadlessPyGhidraLauncher(PyGhidraLauncher):
264+
    """
265+
    Headless pyghidra launcher
266+
    """
267+
```
268+
```python
269+
class DeferredPyGhidraLauncher(PyGhidraLauncher):
270+
    """
271+
    PyGhidraLauncher which allows full Ghidra initialization to be deferred.
272+
    initialize_ghidra must be called before all Ghidra classes are fully available.
273+
    """
274+
```
275+
```python
276+
class GuiPyGhidraLauncher(PyGhidraLauncher):
277+
    """
278+
    GUI pyghidra launcher
279+
    """
280+
```
281+
282+
#### Example:
283+
```python
284+
from pyghidra.launcher import HeadlessPyGhidraLauncher
285+
286+
launcher = HeadlessPyGhidraLauncher()
287+
launcher.add_classpaths("log4j-core-2.17.1.jar", "log4j-api-2.17.1.jar")
288+
launcher.add_vmargs("-Dlog4j2.formatMsgNoLookups=true")
289+
launcher.start()
290+
```
167291

168-
There may be some Python modules and Java packages with the same import path. When this occurs the Python module takes precedence.
169-
While jpype has its own mechanism for handling this situation, PyGhidra automatically makes the Java package accessible by allowing
170-
it to be imported with an underscore appended to the package name.
292+
## Handling Package Name Conflicts
293+
There may be some Python modules and Java packages with the same import path. When this occurs the
294+
Python module takes precedence. While JPype has its own mechanism for handling this situation,
295+
PyGhidra automatically makes the Java package accessible by allowing it to be imported with an
296+
underscore appended to the package name:
171297

172298
```python
173299
import pdb # imports Python's pdb
