.. _MJW: ==================== MuJoCo Warp (MJWarp) ==================== .. toctree:: :hidden: API MuJoCo Warp (MJWarp) is an implementation of MuJoCo written in `Warp `__ and optimized for `NVIDIA `__ hardware and parallel simulation. MJWarp lives in the `google-deepmind/mujoco_warp `__ GitHub repository and is currently in beta. MJWarp is developed and maintained as a joint effort by `NVIDIA `__ and `Google DeepMind `__. .. TODO: remove after release .. admonition:: Beta software :class: warning - MJWarp is beta software and is under active development. - MJWarp developers will triage and respond to `bug reports and feature requests `__. - MJWarp is mostly feature complete but requires performance optimization, documentation, and testing. - The intended audience during Beta are physics engine enthusiasts and learning framework integrators. .. _MJW_install: Installation ============ The beta version of MuJoCo Warp is installed from GitHub. Please note that the beta version of MuJoCo Warp does not support all versions of MuJoCo, Warp, CUDA, NVIDIA drivers, etc. .. code-block:: shell git clone https://github.com/google-deepmind/mujoco_warp.git cd mujoco_warp python3 -m venv env source env/bin/activate pip install --upgrade pip pip install uv uv pip install -e .[dev,cuda] Test the Installation .. code-block:: shell pytest .. _MJW_Usage: Basic Usage =========== Once installed, the package can be imported via ``import mujoco_warp as mjw``. Structs, functions, and enums are available directly from the top-level :mod:`mjw ` module. Structs ------- Before running MJWarp functions on an NVIDIA GPU, structs must be copied onto the device via :func:`mjw.put_model ` and :func:`mjw.make_data ` or :func:`mjw.put_data ` functions. Placing an :ref:`mjModel` on device yields an :class:`mjw.Model `. Placing an :ref:`mjData` on device yields an :class:`mjw.Data `. .. code-block:: python mjm = mujoco.MjModel.from_xml_string("...") mjd = mujoco.MjData(mjm) m = mjw.put_model(mjm) d = mjw.put_data(mjm, mjd) These MJWarp variants mirror their MuJoCo counterparts but have a few key differences: #. :class:`mjw.Model ` and :class:`mjw.Data ` contain Warp arrays that are copied onto device. #. Some fields are missing from :class:`mjw.Model ` and :class:`mjw.Data ` for features that are unsupported. Batch sizes ----------- MJWarp is optimized for parallel simulation. A batch of simulations can be specified with three parameters: - :attr:`nworld `: Number of worlds to simulate. - _`nconmax`: Expected number of contacts per world. The maximum number of contacts for all worlds is ``nconmax * nworld``. - _`naconmax`: Alternative to `nconmax`_, maximum number of contacts over all worlds. If `nconmax`_ and `naconmax`_ are both set then `nconmax`_ is ignored. - _`njmax`: Maximum number of constraints per world. .. admonition:: Semantic difference for `nconmax`_ and `njmax`_. :class: note It is possible for the number of contacts per world to exceed `nconmax`_ if the total number of contacts for all worlds does not exceed ``nworld x nconmax``. However, the number of constraints per world is strictly limited by `njmax`_. .. admonition:: XML parsing :class: note Values for `nconmax`_ and `njmax`_ are not parsed from :ref:`size/nconmax ` and :ref:`size/njmax ` (these parameters are deprecated). Values for these parameters must be provided to :func:`mjw.make_data ` or :func:`mjw.put_data `. Functions --------- MuJoCo functions are exposed as MJWarp functions of the same name, but following `PEP 8 `__-compliant names. Most of the :ref:`main simulation ` and some of the :ref:`sub-components ` for forward simulation are available from the top-level :mod:`mjw ` module. Minimal example --------------- .. code-block:: python # Throw a ball at 100 different velocities. import mujoco import mujoco_warp as mjw import warp as wp _MJCF=r""" """ mjm = mujoco.MjModel.from_xml_string(_MJCF) m = mjw.put_model(mjm) d = mjw.make_data(mjm, nworld=100) # initialize velocities wp.copy(d.qvel, wp.array([[float(i) / 100, 0, 0, 0, 0, 0] for i in range(100)], dtype=float)) # simulate physics mjw.step(m, d) print(f'qpos:\n{d.qpos.numpy()}') .. _mjwCLI: Command line scripts -------------------- Benchmark an environment with testspeed .. code-block:: shell mjwarp-testspeed benchmark/humanoid/humanoid.xml Interactive environment simulation with MJWarp .. code-block:: shell mjwarp-viewer benchmark/humanoid/humanoid.xml Feature Parity ============== MJWarp supports most of the main simulation features of MuJoCo, with a few exceptions. MJWarp will raise an exception if asked to copy to device an :ref:`mjModel` with field values referencing unsupported features. The following features are **not supported** in MJWarp: .. list-table:: :width: 90% :align: left :widths: 2 5 :header-rows: 1 * - Category - Feature * - :ref:`Equality ` - ``FLEX`` * - :ref:`Integrator ` - ``IMPLICIT``, ``IMPLICITFAST`` not supported with fluid drag * - :ref:`Solver ` - ``PGS``, ``noslip``, :ref:`islands ` * - Fluid Model - :ref:`flEllipsoid` * - :ref:`Sensors ` - ``GEOMDIST``, ``GEOMNORMAL``, ``GEOMFROMTO`` * - Flex - ``VERTCOLLIDE=false``, ``INTERNAL=true``, ``nflex > 1`` * - Jacobian format - ``SPARSE`` * - Option - :ref:`contact override ` * - Plugins - ``All`` except ``SDF`` * - :ref:`User parameters ` - ``All`` .. _mjwPerf: Performance Tuning ================== The following are considerations for optimizing the performance of MJWarp. .. _mjwGC: Graph capture ------------- MJWarp functions, for example :func:`mjw.step `, often comprise a collection of kernel launches. Warp will launch these kernels individually if the function is called directly. To improve performance, especially if the function will be called multiple times, it is recommended to capture the operations that comprise the function as a CUDA graph .. code-block:: python with wp.ScopedCapture() as capture: mjw.step(m, d) The graph can then be launched or re-launched .. code-block:: python wp.capture_launch(capture.graph) and will typically be significantly faster compared to calling the function directly. Please see the `Warp Graph API reference `__ for details. Batch sizes ----------- The maximum numbers of contacts and constraints, `nconmax`_ / `naconmax`_ and `njmax`_ respectively, are specified when creating :class:`mjw.Data ` with :func:`mjw.make_data ` or :func:`mjw.put_data `. Memory and computation scales with the values of these parameters. For best performance, the values of these parameters should be set as small as possible while ensuring the simulation does not exceed these limits. It is expected that good values for these limits will be environment specific. In practice, selecting good values typically involves trial-and-error. :func:`mjwarp-testspeed ` with the flag `--measure_alloc` for printing the number of contacts and constraints at each simulation step and interacting with the simulation via :func:`mjwarp-viewer ` and checking for overflow errors can both be useful techniques for iteratively testing values for these parameters. Solver iterations ----------------- MuJoCo's default solver settings for the maximum numbers of :ref:`solver iterations` and :ref:`linesearch iterations` are expected to provide reasonable performance. Reducing MJWarp's settings :attr:`Option.iterations ` and/or :attr:`Option.ls_iterations ` limits may improve performance and should be secondary considerations after tuning `nconmax`_ / `naconmax`_ and `njmax`_. Reducing these limits too much may prevent the constraint solver from converging and can lead to inaccurate or unstable simulation. .. admonition:: Impact on Performance: MJX (JAX) and MJWarp :class: note In :ref:`MJX` these solver parameters are key for controlling simulation performance. With MJWarp, in contrast, once all worlds have converged the solver can early exit and avoid unnecessary computation. As a result, the values of these settings have comparatively less impact on performance. Contact sensor matching ----------------------- Scenes that include :ref:`contact sensors` have a parameter that specifies the maximum number of matched contacts per sensor :attr:`Option.contact_sensor_max_match `. For best performance, the value of this parameter should be as small as possible while ensuring the simulation does not exceed the limit. Matched contacts that exceed this limit will be ignored. The value of this parameter can be set directly, for example ``model.opt.contact_sensor_maxmatch = 16``, or via an XML custom numeric field .. code-block:: xml Similar to the maximum numbers of contacts and constraints, a good value for this setting is expected to be environment specific. :func:`mjwarp-testspeed ` and :func:`mjwarp-viewer ` may be useful for tuning the value of this parameter. Parallel linesearch ------------------- In addition to the constraint solver's iterative linesearch, MJWarp provides a parallel linesearch routine that evaluates a set of step sizes in parallel and selects the best one. The step sizes are spaced logarithmically from :attr:`Model.opt.ls_parallel_min_step ` to 1 and the number of step sizes to evaluate is set via :attr:`Model.opt.ls_iterations `. In some cases the parallel routine may provide improved performance compared to the constraint solver's default iterative linesearch. To enable this routine set ``Model.opt.ls_parallel=True`` or add a custom numeric field to the XML .. code-block:: xml .. admonition:: Experimental feature :class: note The parallel linesearch is currently an experimental feature. Batched :class:`Model ` Fields ================================================= To enable batched simulation with different model parameter values, many :class:`mjw.Model ` fields have a leading batch dimension. By default, the leading dimension is 1 (i.e., ``field.shape[0] == 1``) and the same value(s) will be applied to all worlds. It is possible to override one of these fields with a ``wp.array`` that has a leading dimension greater than one. This field will be indexed with a modulo operation of the world id and batch dimension: ``field[worldid % field.shape[0]]``. Importantly, the field shape should be overridden prior to :ref:`graph capture ` (i.e., ``wp.ScopedCapture``) .. code-block:: python # override shape and values m.dof_damping = wp.array([[0.1], [0.2]], dtype=float) with wp.ScopedCapture() as capture: mjw.step(m, d) It is possible to override the field shape and set the field values after graph capture .. code-block:: python # override shape m.dof_damping = wp.empty((2, 1), dtype=float) with wp.ScopedCapture() as capture: mjw.step(m, d) # set batched values dof_damping_batch = wp.array([[0.1], [0.2]], dtype=float) wp.copy(m.dof_damping, dof_damping_batch) # m.dof = dof_damping_batch will not update the captured graph .. admonition:: Heterogeneous worlds :class: note Heterogeneous worlds, for example: per-world meshes or number of degrees of freedom, are not currently available. .. _mjwFAQ: Frequently Asked Questions ========================== Learning frameworks ------------------- **Does MJWarp work with JAX?** Yes. MJWarp is interoperable with `JAX `__. Please see the `Warp Interoperability `__ documentation for details. Additionally, :ref:`MJX ` provides a JAX API for a subset of MJWarp's :doc:`API `. The backend is specified with ``impl='warp'``. **Does MJWarp work with PyTorch?** Yes. MJWarp is interoperable with `PyTorch `__. Please see the `Warp Interoperability `__ documentation for details. **How to train policies with MJWarp physics?** For examples that train policies with MJWarp physics, please see: - `Isaac Lab `__: Train via `Newton API `__. - `mjlab `__: Train directly with MJWarp using PyTorch. - `MuJoCo Playground `__: Train via :ref:`MJX API `. Features -------- **Is MJWarp differentiable?** No. MJWarp is not currently differentiable via Warp's `automatic differentiation `__ functionality. Updates from the team related to enabling automatic differentiation for MJWarp are tracked in this `GitHub issue `__. **Does MJWarp work with multiple GPUs?** Yes. Warp's ``wp.ScopedDevice`` enables multi-GPU computation .. code-block:: python # create a graph for each device graph = {} for device in wp.get_cuda_devices(): with wp.ScopedDevice(device): m = mjw.put_model(mjm) d = mjw.make_data(mjm) with wp.ScopedCapture(device) as capture: mjw.step(m, d) graph[device] = capture.graph # launch a graph on each device for device in wp.get_cuda_devices(): wp.capture_launch(graph[device]) Please see the `Warp documentation `__ for details and `mjlab distributed training `__ for a reinforcement learning example. **Is MJWarp on GPU deterministic?** No. There may be ordering or *small* numerical differences between results computed by different executions of the same code. This is characteristic of non-deterministic atomic operations on GPU. Set device to CPU with ``wp.set_device("cpu")`` for deterministic results. Developments for deterministic results on GPU are tracked in this `GitHub issue `__. **How are orientations represented?** Orientations are represented as unit quaternions and follow :ref:`MuJoCo's conventions`: ``w, x, y, z`` or ``scalar, vector``. .. admonition:: ``wp.quaternion`` :class: note MJWarp utilizes Warp's `built-in type `__ ``wp.quaternion``. Importantly however, MJWarp does not utilize Warp's ``x, y, z, w`` quaternion convention or operations and instead implements quaternion routines that follow MuJoCo's conventions. Please see `math.py `__ for the implementations.