Metadata-Version: 2.4
Name: expressionizer
Version: 0.8.1
Summary: A Python library for symbolic math expressions and evaluation.
Author-email: Hudson Gouge <hudson.gouge@icloud.com>
License-Expression: MIT
Project-URL: Homepage, https://pypi.org/project/expressionizer/
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: numpy

# Expressionizer

Expressionizer is a Python library for symbolic math expression building, simplification, and step-by-step evaluation output.

It is designed for apps and tools that need explainable algebraic transformations, not just final answers. Typical use cases include math tutoring workflows, educational software, expression debugging, and generating human-readable solution traces.

## Pre-Stable Status

Expressionizer is currently in a **pre-stable (0.x)** phase.

- The core API is usable and actively developed.
- Some interfaces and behavior may change between minor versions.
- If you are using this in production, pin an exact version.

## Why Expressionizer

- Build symbolic expressions in Python.
- Evaluate with variable substitutions.
- Generate step-by-step simplification traces.
- Render expressions as plain text and LaTeX.
- Support structured expression types like equations and inequalities.
- Include procedural expression generation utilities.

## Features

- **Symbolic expression tree primitives**
  - `Symbol`, `Power`, `Product`, `Sum`, `FunctionCall`
  - `Equation` and `InEquality` data structures
- **Convenience constructors**
  - `symbol(...)`, `sum(...)`, `product(...)`, `power(...)`, `fraction(...)`
- **Expression normalization and simplification**
  - Combines numeric terms/factors
  - Merges powers and repeated structures where possible
- **Step-by-step evaluation engine**
  - `evaluate(...)` returns both the result and evaluation context
  - Context tracks snapshots and can render explanation output
  - Includes decomposition-based arithmetic steps for larger operations
- **Configurable evaluator behavior**
  - Limits and precision controls via `EvaluatorOptions`
  - Approximation and bounds behavior for very small/large numbers
- **Rendering**
  - Plain text rendering with `render(...)`
  - LaTeX rendering with `render_latex(...)`
  - Expression tree inspection with `render_type(...)`
- **Function evaluation support**
  - Works with substitutions for variables and callables
  - Includes common math function support through procedural helpers
- **Procedural generation utilities**
  - Random variable name generation
  - Random number generation with constraints
  - Weighted random expression generation for testing/content generation
  - Optional calculus generation controls (`allow_calculus`, `difficulty`, `guarantee_solvable`)

## Installation

```bash
pip install expressionizer
```

## Quick Start

```python
from expressionizer import symbol, sum, power, evaluate, render_latex

x = symbol("x")
expr = power(sum([x, 2]), 2)

result, context = evaluate(expr, substitutions={"x": 3})

print("Result:", result)
print("Expression (LaTeX):", render_latex(expr))
print(context.render())
```

Output from a real run:

```text
Result: 25
Expression (LaTeX): (2 + x)^2
## Step 1
Substitute $x = 3$:
$$(2 + x)^2 \\
= (2 + 3)^2 \\
= 5^2$$

## Step 2
$$5^2 \\
= 5(5) \\
= 25$$
```

## Real Examples (Generated by Expressionizer)

### 1) Symbolic multiplication with substitution

```python
from expressionizer import symbol, sum, product, evaluate

x = symbol("x")
expr = product([sum([x, 4]), sum([x, 1])])

result, context = evaluate(expr, substitutions={"x": 5})

print(result)
print(context.render())
```

```text
54
## Step 1
Substitute $x = 5$:
$$(4 + x)(1 + x) \\
= (4 + 5)(1 + 5) \\
= 9(1 + 5)$$

## Step 2
$$9(1 + 5) \\
= 9(6)$$

## Step 3
$$9(6) \\
= 54$$
```

### 2) Decimal decomposition and place-value addition

```python
from expressionizer import Sum, evaluate

expr = Sum([4, 7.90623])
result, context = evaluate(expr)

print(result)
print(context.render())
```

```text
11.90623
Let's break $4$ and $7.90623$ down into their components.
$$4 + 7.90623 \\
= 4 + 7 + 0.9 + 0.006 + 0.0002 + 0.00003$$

[aligned place-value rows]
4.00000
7.00000
0.90000
0.00600
0.00020
0.00003

$10^{-5}$: $3 + 0 + 0 + 0 + 0 + 0 = 3$
$10^{-4}$: $0 + 2 + 0 + 0 + 0 + 0 = 2$
$10^{-3}$: $0 + 0 + 6 + 0 + 0 + 0 = 6$
$10^{-1}$: $0 + 0 + 0 + 9 + 0 + 0 = 9$
$10^{0}$: $0 + 0 + 0 + 0 + 7 + 4 = 11$, carry the 1.
$10^{1}$: 1 (carried)
Putting it together, we get $11.90623$.
$$ 4 + 7 + 0.9 + 0.006 + 0.0002 + 0.00003 = 11.90623 $$
```

`context.render()` returns a formatted explanation sequence you can display in apps, notebooks, or web UIs.
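
The place-value breakdown in the trace above can be approximated outside the library. The following is a minimal sketch, not Expressionizer's implementation: `place_value_components` is a hypothetical helper built on the standard `decimal` module, and it assumes a non-negative, finite input.

```python
from decimal import Decimal

def place_value_components(value):
    """Split a non-negative finite decimal into its non-zero place-value parts."""
    sign, digits, exponent = Decimal(str(value)).as_tuple()
    components = []
    for index, digit in enumerate(digits):
        if digit == 0:
            continue  # zero digits contribute no component
        # Power of ten carried by this digit position.
        power = exponent + (len(digits) - 1 - index)
        components.append(Decimal(digit).scaleb(power))
    return components

parts = place_value_components("7.90623")
# Fixed-point formatting avoids scientific notation for whole tens, hundreds, etc.
print(" + ".join(format(part, "f") for part in parts))
```

Summing the returned components reproduces the original value, which is the property the "Putting it together" line in the trace relies on.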

## Core API Overview

- `evaluate(expression, substitutions={}, error_on_invalid_snap=True, options=None)`
  - Returns `(result, context)`
- `compact_evaluator_options(...)`
  - Returns a compact preset for shorter explanations and lower token cost
- `render(expression, group=False)`
  - Plain text expression rendering
- `render_latex(expression, renderOptions=...)`
  - LaTeX rendering for display and documentation
- Constructors:
  - `symbol(name)`
  - `sum(terms)`
  - `product(factors)`
  - `power(base, exponent)`
  - `fraction(numerator, denominator)`
  - `derivative(expression, variable, order=1)`
  - `partial_derivative(expression, variables)`
  - `integral(expression, variable, lower=None, upper=None)`

## Calculus Coverage Notes

Expressionizer includes a native rule-based calculus engine for derivatives and integrals, including multivariate differentiation and definite/indefinite integrals.

- Coverage is strong for common educational forms (polynomials, many trig/exp/log forms, product/chain/power rules).
- Some advanced integrals and non-elementary forms will remain symbolic (by design) rather than returning incorrect simplifications.
- For procedural generation, prefer `guarantee_solvable=True` when you need high reliability for auto-generated calculus problems.
- The evaluator exposes solve metadata (`solve_status`, `reason_code`, coverage tags, explanation events) so you can filter low-confidence outputs in training pipelines.

## Explanation Customization

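You can tune both evaluator behavior and wording style without changing defaults. The snippets in this section assume `expr` and `substitutions` are already defined, for example as in Quick Start: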

```python
from expressionizer import EvaluatorOptions, WordingOptions, evaluate

result, context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        wording_style="concise",
        wording_options=WordingOptions(step_heading_template="### Phase {number}"),
    ),
)
```

For full per-call wording customization (including per-generation language/profile swaps),
use `ExplanationProfile`:

```python
from expressionizer import EvaluatorOptions, ExplanationProfile, evaluate

result, context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        explanation_profile=ExplanationProfile(
            locale="en",
            missing_key_policy="fallback",
            message_overrides={
                "step.heading": "## Phase {number}",
                "equation.unsupported_equation_arity": "Unsupported equation format.",
            },
        )
    ),
)
```

`ExplanationProfile` fields:

- `locale`: logical locale tag (used for per-call routing and future locale packs)
- `style_type`: built-in style overlay (`default|compact|plain|xml`)
- `missing_key_policy`: `fallback|marker|error` for unresolved localization keys
- `message_overrides`: key-based template overrides (`key -> string`)
- `exact_text_overrides`: exact-string overrides for legacy/unkeyed text (`text -> string`)

Defaults are production-safe:

- if you do nothing, built-in evaluator/equation templates render normally
- if you override only a few keys, only those keys change
- set `missing_key_policy="error"` to require every encountered key be supplied in `message_overrides`
- set `missing_key_policy="fallback"` (default) to use built-in safe defaults when a key is not overridden
- set `missing_key_policy="marker"` to surface unresolved keys as `[[key]]` while debugging
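
The three policies can be pictured with a small standalone resolver. This is a sketch under assumptions, not the library's code: `BUILTIN_MESSAGES` and `resolve_message` are hypothetical names, and the sketch treats any key missing from `message_overrides` as unresolved, which may be broader than what the library considers unresolved.

```python
# Hypothetical built-in catalog; the real key set lives inside the library.
BUILTIN_MESSAGES = {"step.heading": "## Step {number}"}

def resolve_message(key, message_overrides, missing_key_policy="fallback"):
    """Pick the template for `key` under one of the three documented policies."""
    if key in message_overrides:
        return message_overrides[key]  # an explicit override always wins
    if missing_key_policy == "fallback":
        return BUILTIN_MESSAGES[key]  # use the built-in default
    if missing_key_policy == "marker":
        return f"[[{key}]]"  # make the gap visible while debugging
    if missing_key_policy == "error":
        raise KeyError(f"no override supplied for {key!r}")
    raise ValueError(f"unknown policy: {missing_key_policy!r}")

overrides = {"step.heading": "## Phase {number}"}
print(resolve_message("step.heading", overrides).format(number=1))
print(resolve_message("step.heading", {}, "fallback").format(number=1))
print(resolve_message("step.heading", {}, "marker"))
```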

Built-in locale packs currently include:

- `en`
- `es`
- `fr`
- `de`
- `ko`
- `he`
- `he-niqqud`

Changing language per generation:

```python
from expressionizer import EvaluatorOptions, ExplanationProfile, evaluate

# One call in Spanish
_, es_context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        explanation_profile=ExplanationProfile(locale="es")
    ),
)

# Another call in Korean (same expression, different language)
_, ko_context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        explanation_profile=ExplanationProfile(locale="ko")
    ),
)
```

For dataset generation, pick locale per sample:

```python
import random
from expressionizer import EvaluatorOptions, ExplanationProfile, evaluate, supported_locales

candidate_locales = [loc for loc in supported_locales() if loc != "en"]

def render_one(expr, substitutions):
    locale = random.choice(candidate_locales)
    _, context = evaluate(
        expr,
        substitutions=substitutions,
        options=EvaluatorOptions(
            explanation_profile=ExplanationProfile(locale=locale)
        ),
    )
    return locale, context.render()
```

Language selection works the same way for equation solving:

```python
from expressionizer import EquationWordingOptions, ExplanationProfile, solve_equation

solution, eq_context = solve_equation(
    equation,
    wording_options=EquationWordingOptions(
        explanation_profile=ExplanationProfile(locale="he-niqqud")
    ),
)
```

Built-in locale packs are validated for:

- placeholder safety (no missing/extra formatting placeholders),
- no accidental English leakage for built-in keys in non-English packs.
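
A placeholder-safety check of this kind can be sketched with the standard library alone. The helper names and the Spanish sample strings below are illustrative assumptions, not the contents or API of the bundled validator:

```python
from string import Formatter

def placeholders(template):
    """Return the set of named format fields used in a template string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name is not None}

def placeholder_diff(reference, translated):
    """Return (missing, extra) placeholder names of a translation versus its reference."""
    expected, found = placeholders(reference), placeholders(translated)
    return sorted(expected - found), sorted(found - expected)

# Hypothetical translations of a heading template.
print(placeholder_diff("## Step {number}", "## Paso {number}"))
print(placeholder_diff("## Step {number}", "## Paso {numero}"))
```

Note that `Formatter().parse` treats doubled braces as escaped literals, so a real validator would need separate handling for `{{other.key}}` template references.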

You can customize formatting structure too (not just wording), including heading style,
step wrappers, separators, and line breaks:

```python
from expressionizer import EvaluatorOptions, ExplanationProfile, evaluate

result, context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        explanation_profile=ExplanationProfile(
            message_overrides={
                "step.heading": "Step {number}",
                "render.newline": " | ",
                "render.step.block": "[{heading}] {body}",
                "render.step.joiner": " || ",
            }
        )
    ),
)
```

Template layering is supported with `{{other.key}}` references:

```json
{
  "render.step.open": "<s{number}>",
  "render.step.close": "</s{number}>",
  "render.step.block": "{{render.step.open}}{newline}{body}{newline}{{render.step.close}}"
}
```
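
To illustrate how such layered templates might resolve, here is a minimal standalone sketch. `expand_template` is a hypothetical helper, and the library's actual resolution order and error handling may differ:

```python
import re

# Matches {{some.key}} references, leaving single-brace placeholders untouched.
REFERENCE = re.compile(r"\{\{([\w.]+)\}\}")

def expand_template(key, templates, _seen=()):
    """Recursively inline {{other.key}} references found in templates[key]."""
    if key in _seen:
        raise ValueError(f"circular template reference: {key}")

    def substitute(match):
        return expand_template(match.group(1), templates, _seen + (key,))

    return REFERENCE.sub(substitute, templates[key])

templates = {
    "render.step.open": "<s{number}>",
    "render.step.close": "</s{number}>",
    "render.step.block": "{{render.step.open}}{newline}{body}{newline}{{render.step.close}}",
}

block = expand_template("render.step.block", templates)
# Single-brace placeholders are filled only after references are expanded.
print(block.format(number=1, newline="\n", body="5^2 = 25"))
```

Expanding references before filling placeholders means a nested template such as `render.step.open` sees the same `{number}` value as the block that includes it.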

Built-in style overlays can be selected per call:

```python
from expressionizer import EvaluatorOptions, ExplanationProfile, evaluate

_, context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        explanation_profile=ExplanationProfile(
            locale="es",
            style_type="xml",
        )
    ),
)
```

Language-pack quality tooling:

- `python -m expressionizer.localization_catalog --output localization_keys.json`
  - Generates a canonical key catalog for translators and reviewers.
- `python -m expressionizer.localization_catalog --validate-only`
  - Validates built-in locale packs against the full catalog, including runtime fallback gaps.
- `python -m expressionizer.localization_catalog --validate-only --print-coverage`
  - Prints per-locale key coverage summary using the catalog.
- `python -m expressionizer.localization_catalog --validate-only --require-full-locale-coverage`
  - Fails validation if any non-English locale still falls back to English for tracked keys.

For long numeric arithmetic, you can enable/configure calculator mode. This is useful for
student-friendly readability and for AI training-data workflows that need tool-call placeholders:

```python
from expressionizer import CalculatorModeOptions, EvaluatorOptions, evaluate

result, context = evaluate(
    expr,
    substitutions=substitutions,
    options=EvaluatorOptions(
        calculator_mode=CalculatorModeOptions(
            enabled=True,
            multiplication_operand_complexity_threshold=6,
            result_complexity_threshold=20,
            template="<tool_call name='calculator' op='{operation}' expr='{expression}' />\nResult: {result}",
        )
    ),
)
```

Available calculator placeholders include:

- `{operation}`: `addition`, `subtraction`, `multiplication`, or `power`
- `{expression}`: rendered expression
- `{result}`: rendered final value
- `{lhs}`, `{rhs}`: left/right operands for binary operations
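
As a rough illustration of how these placeholders fill a template, the sketch below uses plain `str.format`. The operand values and the rendered form of `{expression}` are made up for the example, and the library's own substitution mechanism is assumed rather than confirmed:

```python
template = (
    "<tool_call name='calculator' op='{operation}' expr='{expression}' />\n"
    "Result: {result}"
)

lhs, rhs = 12, 34
values = {
    "operation": "multiplication",
    "expression": f"{lhs} * {rhs}",  # assumed rendering of the expression
    "result": lhs * rhs,
    "lhs": lhs,  # unused by this template; str.format ignores extra keys
    "rhs": rhs,
}

rendered = template.format(**values)
print(rendered)
```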

Calculator mode is part of the default behavior, with practical thresholds so that large numeric
arithmetic does not produce excessively long place-value traces.

### Choosing Profiles

Expressionizer supports two generation profiles:

- `realistic` (default): intended for user-facing outputs and classroom-style workflows
  - tighter expression complexity bounds
  - more predictable explanation length
- `stress`: intended for robustness/fuzz testing
  - broader expression shapes and higher complexity ceilings

Use `realistic` for production user experiences, and `stress` for QA pipelines.

You can also control whether generated expression problems are solvable:

- `solvability_mode="solvable"`: intentionally avoids injected impossible/domain-invalid cases
- `solvability_mode="mixed"` (default): mostly solvable, with occasional intentionally unsolvable problems
- `solvability_mode="unsolvable"`: intentionally generates impossible/domain-invalid problems for negative examples

For `mixed`, use `unsolvable_probability` to tune the share of impossible cases, and
`hard_problem_probability` to occasionally promote a realistic case into a harder-but-still-bounded one.
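
One way to picture how the two probabilities interact is the standalone sampler below. It is only a sketch: `choose_case` is a hypothetical function, the default values are illustrative rather than the library's defaults, and whether the hard-problem draw also applies to unsolvable cases is an assumption.

```python
import random

def choose_case(rng, solvability_mode="mixed",
                unsolvable_probability=0.12, hard_problem_probability=0.2):
    """Decide whether one generated problem is solvable and whether it is hard."""
    if solvability_mode == "solvable":
        solvable = True
    elif solvability_mode == "unsolvable":
        solvable = False
    elif solvability_mode == "mixed":
        # rng.random() is in [0, 1), so probability 0 never yields an unsolvable case.
        solvable = rng.random() >= unsolvable_probability
    else:
        raise ValueError(f"unknown solvability_mode: {solvability_mode!r}")
    # Independent draw that occasionally promotes the case to a harder variant.
    hard = rng.random() < hard_problem_probability
    return {"solvable": solvable, "hard": hard}

rng = random.Random(0)  # seeded for reproducible sampling
cases = [choose_case(rng) for _ in range(1000)]
print(sum(not case["solvable"] for case in cases), "unsolvable cases out of 1000")
```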

### API Stability Note

For `0.8.x`, these configuration surfaces are intended to remain stable:

- `EvaluatorOptions`
- `WordingOptions`
- `CalculatorModeOptions`
- `generate_random_expression(..., generation_profile=...)`
- `generate_random_expression(..., solvability_mode=..., unsolvable_probability=..., hard_problem_probability=...)`
- `ExplanationProfile` (per-call localization/customization layer)
- CLI flags: `--generation-profile`, `--solvability-mode`, `--unsolvable-probability`, `--hard-problem-probability`, `--wording-style`, `--compact-explanations`, `--step-heading-template`, `--locale`, `--messages-file`, `--exact-text-overrides-file`

For compact output presets:

```python
from expressionizer import compact_evaluator_options, evaluate

result, context = evaluate(
    expr,
    substitutions=substitutions,
    options=compact_evaluator_options(step_heading_template="### Step {number}"),
)
```

CLI tools also support:

- `--wording-style verbose|concise`
- `--compact-explanations`
- `--step-heading-template "### Phase {number}"`
- `--locale en`
- `--style-type default|compact|plain|xml`
- `--messages-file path/to/messages.json`
- `--exact-text-overrides-file path/to/exact_text_overrides.json`
- `--generation-profile realistic|stress`
  - `realistic` is user-facing and avoids extreme expression blowups
  - `stress` is broader/extreme for robustness testing
- `--solvability-mode mixed|solvable|unsolvable`
  - `mixed` is best for real-world distributions with some impossible examples
  - `solvable` is best for student practice datasets where every case should resolve
  - `unsolvable` is best for negative datasets teaching solvability detection
- `--unsolvable-probability 0.12`
  - used when `--solvability-mode mixed`
- `--hard-problem-probability 0.2`
  - occasionally escalates a realistic problem into a harder variant

## Quality and Audit Utilities

- `python -m expressionizer.procedural_test ...`
  - Stress-tests generated expressions/equations until a failure or timeout
- `python -m expressionizer.explanation_audit ...`
  - Audits explanation consistency and optional SymPy equivalence
- `python -m expressionizer.manual_review_cases --cases 40 --generation-profile realistic`
  - Generates user-facing manual-review samples with safer default complexity
- `python -m expressionizer.manual_review_cases --cases 40 --solvability-mode mixed --unsolvable-probability 0.15`
  - Generates mixed datasets with both solvable and intentionally impossible cases
- `python -m expressionizer.equation_manual_review_cases --cases 40`
  - Generates a markdown set of equation/system explanations for manual review
  - Supports localization overrides with `--locale`, `--messages-file`, and `--exact-text-overrides-file`
- `python -m expressionizer.localization_catalog --validate-only`
  - Validates locale packs against catalog coverage + placeholder consistency
- `python release_smoke.py`
  - Runs a quick pre-release local smoke check

## Compatibility

- Python `>=3.8`
- OS independent

## Roadmap

As a pre-stable library, near-term improvements are focused on:

- API stabilization toward `1.0`
- Expanded test coverage
- Improved docs and examples
- Continued refinement of step-by-step output quality

## SEO Keywords

Python symbolic math library, step-by-step math solver, algebra expression evaluator, LaTeX math renderer, expression simplification engine, educational math software backend.

## Contributing

Issues and pull requests are welcome. If you report a bug, include:

- the expression
- substitutions used
- expected behavior
- actual behavior and rendered steps

## License

MIT
