pyopmnearwell.ml.ensemble module

Run high-fidelity nearwell simulations in OPM Flow for an ensemble of varying input arguments.

pyopmnearwell.ml.ensemble.calculate_WI(pressures: ndarray, injection_rates: float | ndarray) tuple[ndarray, list[int]]

Calculate the well index (WI) for a given dataset.

The well index (WI) is calculated using the following formula:

\[ WI = \frac{q}{p_w - p_{gb}} \]

Here \(q\) is the injection rate, \(p_w\) the well pressure, and \(p_{gb}\) the grid block pressure.

Note:
  • The unit of WI_array depends on the units of pressures and injection_rates.

  • In 3D this might fail. The user is responsible for fixing the array shapes before passing them to this function.

Args:
pressures (np.ndarray): The first axis indexes the ensemble members. The last axis is assumed to be the x-axis. Must contain the well cells (i.e., the well pressure values) at pressures[...,0].

injection_rates (float | np.ndarray): Injection rate. If an np.ndarray, it must have shape broadcastable to pressures.shape.

Returns:
WI_array (numpy.ndarray): shape=(...,num_x_cells - 1). An array of well index values for each data point in the dataset.

failed_indices (list[int]): Indices of the ensemble members for which WI could not be computed, e.g., if the simulation went wrong and the pressure difference is zero.

Raises:

ValueError: If no data is found for the ‘pressure’ keyword in the dataset.
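
The formula above can be illustrated with a short NumPy sketch. It is a stand-alone approximation written for this page, not the code of calculate_WI: the helper name well_index_sketch is hypothetical, it only handles injection rates that broadcast against the pressure differences, and how the library treats failed members in WI_array is not documented here.

```python
import numpy as np


def well_index_sketch(pressures, injection_rates):
    """Illustrative version of WI = q / (p_w - p_gb); not the library code."""
    pressures = np.asarray(pressures, dtype=float)
    p_w = pressures[..., :1]   # well-cell pressure, kept as a length-1 last axis
    p_gb = pressures[..., 1:]  # pressures of the remaining cells along x
    delta_p = p_w - p_gb

    # Flag ensemble members (first axis) that contain a zero pressure difference.
    bad = np.any(delta_p == 0.0, axis=tuple(range(1, delta_p.ndim)))
    failed_indices = [int(i) for i in np.flatnonzero(bad)]

    # Assumption: failed members are simply left as inf/nan in this sketch.
    with np.errstate(divide="ignore", invalid="ignore"):
        wi = injection_rates / delta_p
    return wi, failed_indices


# Two ensemble members, four cells each; the second member has a flat pressure field.
pressures = np.array([[120.0, 110.0, 105.0, 100.0],
                      [100.0, 100.0, 100.0, 100.0]])
wi, failed = well_index_sketch(pressures, injection_rates=10.0)
print(wi.shape)  # (2, 3)
print(failed)    # [1]
```

The result has one entry fewer along the last axis than pressures, matching the documented shape (...,num_x_cells - 1).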

pyopmnearwell.ml.ensemble.calculate_radii(gridfile: Path, num_cells: int = 400, return_outer_inner: bool = False, triangle_grid: bool = False, angle: float = 1.0471975511965976) ndarray | tuple[ndarray, ndarray, ndarray]

Calculate the radii of the cells in a grid from a given grid file.

Args:

gridfile (str | pathlib.Path): Path to the file containing the grid.

num_cells (int, optional): Number of cells in the grid. Defaults to 400.

num_dims (int, optional): Number of dimensions of the grid. Defaults to 1.

return_outer_inner (bool, optional): Whether to return the inner and outer radii in addition to the average radii. Defaults to False.

triangle_grid (bool, optional): Whether the grid is a triangle grid. If True, transform altitudes of the triangle grid to radii of a radial grid with equal solution. Defaults to False.

angle (float, optional): Angle between both sides of the triangle grid. Defaults to math.pi/3.

Returns:
np.ndarray | tuple[np.ndarray, np.ndarray, np.ndarray]: If return_outer_inner is False, returns an array of the average radii of the cells. If return_outer_inner is True, returns a tuple containing the array of average radii, the array of inner radii, and the array of outer radii.

Raises:
AssertionError: If the number of lines in the grid file is not equal to num_cells + 1.

pyopmnearwell.ml.ensemble.create_ensemble(runspecs: dict[str, Any], efficient_sampling: list[str] | None = None, seed: int | None = None) list[dict[str, Any]]

Create an ensemble.

Note:
  • It is assumed that the user provides the variables in the correct units for pyopmnearwell.

  • If the variable name starts with "PERM" or "LOG", the distribution for random sampling is log uniform.

  • If the variable name starts with "INT", the distribution for random sampling is uniform on integers.

  • Else, the distribution for random sampling is uniformly distributed.

Args:
runspecs (dict[str, Any]): Dictionary with at least the following keys:
  • “npoints”: Maximum number of ensemble members.

  • “variables”: dict containing the run arguments that vary within the ensemble. The key specifies the variable name (needs to be identical to the variable name in the *.mako file that is passed to setup_ensemble) and the value is a tuple (min, max, npoints) with the min. and max. values for the variable and the number of samples generated for this variable. By default, the samples are taken from a uniform distribution in the interval \([min, max]\); see the note above for log-uniform and integer variables.

  • “constants”: dict containing the run arguments that are constant for all ensemble members.

efficient_sampling (Optional[list[str]], optional): List containing the names of variables that should be sampled directly instead of being fully meshed and then sampled. This is faster and avoids memory overload for higher-dimensional combinations of variables, e.g., when creating an ensemble with varying vertical permeabilities: just 10 layers with 10 samples each generate a grid of 10^10 values. Sampling directly instead of generating the grid first makes this complexity manageable. Defaults to None.

seed (Optional[int]): Seed for the np.random.Generator. Is also passed to memory_efficient_sample. Defaults to None.

Note: The ensemble is generated as the Cartesian product of all variable ranges. The total number of ensemble members is thus the product of all individual npoints. If runspecs["npoints"] is lower than the product, a random sample of size runspecs["npoints"] of the full ensemble is returned.

Returns:
ensemble (list[dict[str, Any]]): List containing a dict with the specified variable values for each ensemble member.

Raises:
ValueError: If runspecs["npoints"] is larger than the number of generated ensemble members.
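
The following sketch shows what a runspecs dictionary with the keys described above might look like, and how a Cartesian-product ensemble with random subsampling can be built. It is a simplified stand-in, not the implementation of create_ensemble: the variable names and value ranges are made up, integer variables and efficient_sampling are left out, and whether the constants end up in each member dict is not shown.

```python
import itertools

import numpy as np

# Hypothetical run specification following the keys described above.
runspecs = {
    "npoints": 5,
    "variables": {
        "PERM": (1e-13, 1e-11, 4),      # (min, max, npoints), log-uniform by name
        "PRESSURE": (50.0, 150.0, 3),   # (min, max, npoints), uniform
    },
    "constants": {"INJECTION_RATE": 10.0},
}

rng = np.random.default_rng(seed=0)

# Draw the per-variable samples.
samples = {}
for name, (low, high, num) in runspecs["variables"].items():
    if name.startswith(("PERM", "LOG")):
        samples[name] = np.exp(rng.uniform(np.log(low), np.log(high), num))
    else:
        samples[name] = rng.uniform(low, high, num)

# Full ensemble: Cartesian product of all variable samples (4 * 3 = 12 members).
names = list(samples)
full = [dict(zip(names, combo)) for combo in itertools.product(*samples.values())]

# Keep a random subset because fewer members were requested than generated.
picked = rng.choice(len(full), size=runspecs["npoints"], replace=False)
ensemble = [full[i] for i in picked]
print(len(full), len(ensemble))  # 12 5
```

The product grows multiplicatively with every added variable, which is the motivation for the efficient_sampling option.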

pyopmnearwell.ml.ensemble.extract_features(data: dict[str, Any], keywords: list[str], keyword_scalings: dict[str, float] | None = None) ndarray

Extract features from a run_ensemble run into a numpy array.

Note: The features are in the units used in *.UNRST and *.SMSPEC files. The user is responsible for transforming them to the units needed. This may depend on various factors such as the mode OPM will run in. In particular, pressure is in [bar]. Some units for common quantities:
  • Pressure: [bar]

  • Temperature: [°C]

  • Saturation: [unitless]

  • Time: []

Args:

data (dict[str, Any]): Data generated by run_ensemble.

keywords (list[str]): Keywords to extract. The features will be in the order of the keywords.

keyword_scalings (Optional[dict[str, float]]): Scalings for the features.

Returns:
feature_array: (numpy.ndarray): shape=(ensemble_size, num_report_steps, num_cells, num_features)

An array of input features for each data point in the dataset.

Raises:

ValueError: If no data is found for one of the keywords.
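
A minimal sketch of the stacking and scaling described above. The layout of data is an assumption made for illustration (one array of shape (ensemble_size, num_report_steps, num_cells) per keyword); the actual structure returned by run_ensemble is not specified on this page, and the keyword names and the scaling factor are examples only.

```python
import numpy as np

# Assumed layout (not confirmed by the documentation above): one array of shape
# (ensemble_size, num_report_steps, num_cells) per keyword.
data = {
    "PRESSURE": np.full((2, 3, 4), 100.0),  # [bar]
    "SGAS": np.full((2, 3, 4), 0.25),       # [unitless]
}
keywords = ["PRESSURE", "SGAS"]
keyword_scalings = {"PRESSURE": 1e5}  # e.g. convert bar to Pa

columns = []
for keyword in keywords:
    if keyword not in data:
        raise ValueError(f"No data found for keyword {keyword!r}.")
    # Keywords without an explicit scaling are left unchanged.
    columns.append(data[keyword] * keyword_scalings.get(keyword, 1.0))

# Stack along a new last axis so the features follow the keyword order.
feature_array = np.stack(columns, axis=-1)
print(feature_array.shape)  # (2, 3, 4, 2)
```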

pyopmnearwell.ml.ensemble.get_flags(makofile: str | Path) str

Extract OPM Flow run flags from a makofile.

Args:

makofile (str | pathlib.Path): Path to the makofile.

Returns:

str: All flags that are passed to OPM Flow.

pyopmnearwell.ml.ensemble.integrate_fine_scale_value(radial_values: ndarray, radii: ndarray, block_sidelengths: float | ndarray, axis: int = -1) ndarray

Integrate a fine scale value across all radial cells covering a square grid block.

This function correctly takes only fractions of the integrated values for radial cells that are only partially inside the square block.

Args:

radial_values (np.ndarray): Cell values for the radial cells.

radii (np.ndarray): Array of radii for inner and outer radius of the radial cells. Has to be ordered from low to high.

block_sidelengths (float | np.ndarray): The sidelengths of the square grid blocks. The length of this array determines the new length of the integrated axis. # TODO: Update the tests for this new functionality, i.e., that block_sidelengths determines the return shape.

axis (int): Axis to integrate along.

Returns:

np.ndarray: The integrated values of the fine-scale data.

Raises:

ValueError: If the radial cells do not cover the square grid block.
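
The idea of counting only the part of each radial cell that lies inside the square block can be sketched geometrically. The example below is an independent illustration, not the code of integrate_fine_scale_value: it assumes a single square block centered on the well, treats the cell values as densities per unit area, and uses the hypothetical helpers disk_area_in_square and annulus_fractions.

```python
import numpy as np


def disk_area_in_square(radius, sidelength):
    """Area of a disk of the given radius that lies inside a concentric square."""
    half = sidelength / 2.0
    if radius <= half:                 # disk entirely inside the square
        return np.pi * radius**2
    if radius >= half * np.sqrt(2.0):  # square entirely inside the disk
        return sidelength**2
    # Otherwise four circular segments stick out past the sides of the square.
    segment = radius**2 * np.arccos(half / radius) - half * np.sqrt(radius**2 - half**2)
    return np.pi * radius**2 - 4.0 * segment


def annulus_fractions(radii, sidelength):
    """Fraction of each annulus [radii[i], radii[i+1]] that lies inside the square."""
    inside = np.array([disk_area_in_square(r, sidelength) for r in radii])
    full = np.pi * np.diff(np.asarray(radii) ** 2)
    return np.diff(inside) / full


radii = np.array([0.0, 0.25, 0.5, 0.6, 0.8])  # ordered from low to high
fractions = annulus_fractions(radii, sidelength=1.0)
values = np.ones(4)  # one value per radial cell, here a density of 1
integral = np.sum(values * fractions * np.pi * np.diff(radii**2))
print(fractions)  # the two inner cells are fully inside, the outer two only partly
print(integral)   # close to 1.0, the area of the unit square
```

The outermost radius (0.8) exceeds half the diagonal of the unit square, so the radial cells cover the block; if they did not, the documented behavior is a ValueError.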

pyopmnearwell.ml.ensemble.memory_efficient_sample(variables: ndarray, num_members: int, seed: int | None = None) ndarray

Sample all variables individually.

Note: Requires that all variables arrays have the same length.

Args:

variables (np.ndarray), (shape=(num_variables, len_variables)): _description_

num_members (int): _description_

seed (Optional[int]): Seed for the np.random.Generator. Defaults to None.

Returns:

np.ndarray (shape=()):
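
Sampling each variable individually, rather than building the full mesh first, might look like the sketch below. The helper sample_individually is hypothetical and the output shape (num_members, num_variables) is an assumption, since the return shape is not documented above.

```python
import numpy as np


def sample_individually(variables, num_members, seed=None):
    """Draw one value per variable and member without building the full mesh."""
    variables = np.asarray(variables)
    num_variables, len_variables = variables.shape
    rng = np.random.default_rng(seed)
    # Independent random index into each variable's row of candidates, per member.
    idx = rng.integers(0, len_variables, size=(num_members, num_variables))
    # Result entry [m, v] is variables[v, idx[m, v]].
    return variables[np.arange(num_variables), idx]


# 10 variables with 10 candidate values each: the full mesh would hold 10**10 rows.
variables = np.tile(np.linspace(1.0, 10.0, 10), (10, 1))
members = sample_individually(variables, num_members=100, seed=0)
print(members.shape)  # (100, 10)
```

Memory use scales with num_members * num_variables here, instead of with the product of all variable lengths.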

pyopmnearwell.ml.ensemble.run_ensemble(flow_path: str | Path, ensemble_path: str | Path, runspecs: dict[str, Any], ecl_keywords: list[str], init_keywords: list[str], summary_keywords: list[str], num_report_steps: int | None = None, keep_result_files: bool = False, **kwargs) dict[str, Any]

Run OPM Flow for each ensemble member and store data.

Note: The initial time step (i.e., t=0) is always disregarded.

Args:

flow_path (str | pathlib.Path): _description_

ensemble_path (str | pathlib.Path): _description_

runspecs (dict[str, Any]): _description_

ecl_keywords (list[str]): _description_

init_keywords (list[str]): _description_

summary_keywords (list[str]): _description_

num_report_steps (Optional[int], optional): Disregard an ensemble simulation if it did not run to the last report step. Defaults to None.

keep_result_files (bool): Keep result files of all ensemble members, not only the first one. Defaults to False.

**kwargs: Possible parameters are:
  • step_size_time (int): Save data only for every step_size_time report step. Default is 1.

  • step_size_cell (int): Save data only for every step_size_cell grid cell. Default is 1.

  • flags (str): Flags to run OPM Flow with.

Returns:

dict[str, Any]: _description_
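
The effect of step_size_time and step_size_cell can be pictured as strided slicing. The sketch below is only an illustration under assumptions: the array layout (num_report_steps, num_cells) and the order of operations (drop the initial step first, then thin) are not confirmed by the documentation above.

```python
import numpy as np

# Stand-in for one keyword of one ensemble member: (num_report_steps, num_cells).
# The real result layout of run_ensemble is not documented here; this is an assumption.
values = np.arange(6 * 8, dtype=float).reshape(6, 8)

step_size_time = 2
step_size_cell = 4

without_initial = values[1:]  # drop the initial time step (t = 0)
# Keep every step_size_time-th report step and every step_size_cell-th cell.
thinned = without_initial[::step_size_time, ::step_size_cell]
print(thinned.shape)  # (3, 2)
```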

pyopmnearwell.ml.ensemble.setup_ensemble(ensemble_path: str | Path, ensemble: list[dict[str, Any]], makofile: str | Path, **kwargs) None

Create a deck file for each ensemble member.

Args:

ensemble_path (str | pathlib.Path): The path to the ensemble directory.

ensemble (list[dict[str, Any]]): A list of dictionaries containing the parameters for each ensemble member. Usually generated by create_ensemble.

makofile (str | pathlib.Path): The path to the Mako template file for the pyopmnearwell deck for ensemble members.

**kwargs: kwargs are passed to reservoir_files. Possible kwargs are:
  • recalc_grid (bool, optional): Whether to recalculate GRID.INC for each ensemble member. Defaults to False.

  • recalc_tables (bool, optional): Whether to recalculate TABLES.INC for each ensemble member. Defaults to False.

  • recalc_sections (bool, optional): Whether to recalculate GEOLOGY.INC and REGIONS.INC for each ensemble member. Defaults to False.

Raises:

Exception: If there is an error rendering the Mako template.

Returns:

None
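
The one-file-per-member pattern can be sketched with the standard library. Note the substitutions: string.Template stands in for the Mako template, the template content is invented, and the file naming scheme is hypothetical; none of this reflects the actual deck layout written by setup_ensemble.

```python
import tempfile
from pathlib import Path
from string import Template

# string.Template stands in for the Mako template here; the real deck uses Mako syntax.
template = Template("PERM ${PERM}\nPRESSURE ${PRESSURE}\n")
ensemble = [
    {"PERM": 1e-12, "PRESSURE": 80.0},
    {"PERM": 5e-12, "PRESSURE": 120.0},
]

ensemble_path = Path(tempfile.mkdtemp())
for i, member in enumerate(ensemble):
    # Hypothetical naming scheme: one rendered input file per ensemble member.
    deck = ensemble_path / f"member_{i}.txt"
    deck.write_text(template.substitute(member))

written = sorted(p.name for p in ensemble_path.iterdir())
print(written)  # ['member_0.txt', 'member_1.txt']
```

The keys of each member dict must match the placeholder names in the template, which mirrors the requirement that variable names be identical to those in the *.mako file.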

pyopmnearwell.ml.ensemble.store_dataset(features: ndarray, targets: ndarray, savepath: str | Path) Path

Store a TensorFlow dataset given by two tensors.

Args:

features (np.ndarray): Features of the dataset.

targets (np.ndarray): Targets of the dataset.

savepath (str | pathlib.Path): Folder where the dataset should be saved.

Returns:

pathlib.Path: Savepath of the dataset