Introduction

Chemical risk assessment is no longer a task reserved for large companies or chemical manufacturers.

Japan's new chemical substance regulations put much more weight on autonomous chemical management. Workplaces that manufacture, handle, or supply risk-assessment target substances must appoint chemical substance managers, and employers must minimize worker exposure to those substances.

Chemiguide explains that the regulated substance scope reaches roughly 2,900 substances around April 2026. At the same time, Chemisapo's target-substance page currently lists downloadable files covering about 2,300 substances for the April 2025 / April 2026 enforcement set and about 2,500 substances for April 2027. The exact operational list depends on the enforcement date and source file, but the practical point is clear: the list is already too large for comfortable manual matching in Excel, and it will keep changing.

Figure: Overview of the chemical risk assessment OSS pipeline. Public data becomes SQLite and Python libraries, then agent-facing MCP tools.

Scope of this article

The libraries described here are not official tools from MHLW or related agencies. Final compliance decisions should be checked against official documents, SDSs, qualified experts, and the relevant labour standards office. The goal here is to make public information and public methodology easier to use as verifiable code.

Why Risk Assessment Matters Now

MHLW's chemical risk assessment page explains that more than 400 work-related chemical accidents causing at least four lost workdays occur every year, not only in chemical manufacturing but across a wider set of industries and work types. Many have involved substances outside the older special ordinance framework.

That is the real reason this matters. Chemical management is not only a laboratory or factory problem. Cleaning work, food service, medical and welfare facilities, construction, dry cleaning, retail, schools, and warehouses all use products containing chemical substances.

But running a risk assessment in the field is difficult.

You need to know whether the substance is legally in scope.
You need to read ingredient names, CAS numbers, content percentages, and GHS classifications from SDSs.
You need to check the Industrial Safety and Health Act and its special ordinances, PRTR, the Poisonous and Deleterious Substances Control Act, CSCL, and other legal frameworks.
You need to model work conditions and estimate inhalation, dermal, and physical hazards.
You need to use the result to decide which controls should be prioritized.

Large companies can often absorb that work with dedicated departments or paid systems. Small and medium-sized workplaces often cannot.

CREATE-SIMPLE Is Good. That Is Why I Want It to Be Machine-Readable

One of the most important public tools for this problem is CREATE-SIMPLE, published through MHLW's workplace safety site.

CREATE-SIMPLE is a simplified chemical risk assessment tool for a wide range of workplaces, including service industries and research/testing facilities. It compares an exposure limit or GHS-based management target concentration with an estimated exposure concentration derived from work conditions. It covers inhalation exposure, dermal absorption, and physical hazards. The workplace safety support page currently lists CREATE-SIMPLE_ver3.2, updated in March 2026.

I think CREATE-SIMPLE is a strong public effort. It is distributed as an Excel workbook, and the design criteria and manuals are public. It is free for workplaces to use. That matters.

The gap appears when you want to call it from an internal database, an SDS management workflow, or an AI agent.

A workbook-centered design is hard to invoke mechanically from another system.
Substance screening, legal lookup, calculation, and control comparison often become separate manual tasks.
After estimating risk, it is hard to programmatically compare which control lowers the risk and by how much.
Updated substance lists and GHS classifications still need to be tracked by people.

So I started separating the public data and public methodology into Python libraries, SQLite databases, and MCP tools.

The Overall Shape

There are three core repositories.

Repository	Role	Main artifact
`risk_assessment_list`	Check whether a substance is subject to risk assessment obligations or GHS notice handling	SQLite + Python API
`ra-law-db`	Search Japanese chemical regulations by CAS number	SQLite + Python API
`ra-library`	Calculate, explain, and recommend controls using public methodology references	Python API

For agent and app integration, there are thin MCP layers: ra-mcp and ra-law-mcp. The domain logic stays in the libraries; MCP is just the structured interface.

1. risk_assessment_list: Is This Substance in Scope?

The first practical question is simple: is this substance even in scope?

The relevant public data exists across MHLW target lists, JOHAS/Chemisapo guidance, and NITE GHS classifications. In actual workplaces, that often turns into downloading spreadsheets and matching names or CAS numbers manually.

risk_assessment_list turns that step into a library.

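Python

from risk_assessment_list import evaluate_substance, search_substances

# Synonym-aware search returns candidates only (the query is "formaldehyde").
candidates = search_substances("ホルムアルデヒド")

# The legal decision comes from an exact CAS lookup, not from the search.
result = evaluate_substance("50-00-0")
print(result.legal_ra_required)
print(result.ghs_notice_required)
print(result.ghs_pictograms)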

The repository README describes a workflow that converts MHLW obligation lists and NITE GHS classifications into a bundled SQLite database. It exposes exact lookup, synonym-aware fuzzy candidate search, and mixture evaluation APIs. Search handles Japanese and English names, aliases, Greek-letter variants, and common abbreviations.
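To make the two lookup styles concrete, here is a minimal sketch of the kind of input normalization that sits in front of them. It is not the library's code: `normalize_name`, `is_valid_cas`, and the tiny Greek-letter table are hypothetical names introduced for illustration. The CAS check-digit rule itself is the standard one.

```python
import re
import unicodedata

# Hypothetical mapping for illustration; a real alias table would be larger.
GREEK_TO_LATIN = {"α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta"}

# Whitespace, ASCII hyphen, and Unicode dashes. The katakana long-vowel
# mark (U+30FC) is deliberately not included.
_SEPARATORS = re.compile(r"[\s\-\u2010-\u2015]+")
_CAS_PATTERN = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def normalize_name(name: str) -> str:
    """Fold character width, case, Greek letters, and separators into one key."""
    key = unicodedata.normalize("NFKC", name).casefold()
    for greek, latin in GREEK_TO_LATIN.items():
        key = key.replace(greek, latin)
    return _SEPARATORS.sub("", key)


def is_valid_cas(cas_number: str) -> bool:
    """Check the CAS number format and its check digit."""
    match = _CAS_PATTERN.fullmatch(cas_number.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from the right; the sum mod 10
    # must equal the final check digit.
    total = sum(w * int(d) for w, d in enumerate(reversed(body), start=1))
    return total % 10 == int(match.group(3))


print(normalize_name("α-Pinene") == normalize_name("alpha pinene"))
print(is_valid_cas("75-09-2"), is_valid_cas("75-09-3"))
```

A normalized name is only a search key. A validated CAS number is what an exact lookup should be keyed on.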

The important design choice is that search and legal decision-making are separated. Search returns candidates; the legal decision is made from CAS/list matching. Without that separation, fuzzy search convenience can easily become false legal certainty.
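The separation can be sketched with an in-memory SQLite database. The schema, the rows, and the function names `search_candidates` and `decide_by_cas` are assumptions made for this example; they are not the bundled database or the library's API.

```python
import sqlite3
from typing import List, Optional, Tuple

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE substances (cas_number TEXT PRIMARY KEY, ra_required INTEGER NOT NULL);
    CREATE TABLE aliases (alias TEXT NOT NULL, cas_number TEXT NOT NULL);
    """
)
# Toy rows for illustration only; this is not an official list.
conn.executemany(
    "INSERT INTO substances VALUES (?, ?)",
    [("50-00-0", 1), ("75-09-2", 1)],
)
conn.executemany(
    "INSERT INTO aliases VALUES (?, ?)",
    [
        ("ホルムアルデヒド", "50-00-0"),
        ("formaldehyde", "50-00-0"),
        ("ジクロロメタン", "75-09-2"),
        ("dichloromethane", "75-09-2"),
        ("methylene chloride", "75-09-2"),
    ],
)


def search_candidates(query: str) -> List[Tuple[str, str]]:
    """Search step: return (alias, CAS) candidates. Never a legal answer."""
    rows = conn.execute(
        "SELECT alias, cas_number FROM aliases WHERE alias LIKE ? ORDER BY alias",
        (f"%{query}%",),
    )
    return rows.fetchall()


def decide_by_cas(cas_number: str) -> Optional[bool]:
    """Decision step: exact CAS match only. None means 'not in this list'."""
    row = conn.execute(
        "SELECT ra_required FROM substances WHERE cas_number = ?",
        (cas_number,),
    ).fetchone()
    return None if row is None else bool(row[0])


for alias, cas in search_candidates("methylene"):
    print(alias, cas, decide_by_cas(cas))
```

Returning `None` for an unknown CAS number keeps "not found in this list" distinct from "not required", which is the same caution applied to the decision side.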

2. ra-law-db: Put Law Screening in SQLite

Knowing that a substance is in scope is not enough.

The same CAS number may be relevant to occupational safety law, special ordinances, PRTR, CSCL, poison control, and other frameworks. SDS section 15 helps, but if you want this in a system, you need structured data.

ra-law-db publishes Japanese chemical law screening data as SQLite.

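Python

from ra_law_db import get_law_screening_database

db = get_law_screening_database()

# Exact lookup by CAS number (75-09-2 is dichloromethane).
lookup = db.lookup(cas_number="75-09-2", language="ja")
# Name search; the query is "dichloromethane" in Japanese.
search = db.search(query="ジクロロメタン", mode="auto", limit=10)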

In this repository, regulatory.sqlite3 is the real runtime artifact. CSV and JSONL files remain useful for inspection and debugging, but normal consumers read the packaged SQLite database safely through importlib.resources.
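As a rough sketch of that access pattern, the following opens a packaged SQLite file through `importlib.resources` in read-only mode. It builds a throwaway package so it can run anywhere: `demo_law_pkg`, its `laws` table, and `open_packaged_db` are hypothetical, and the read-only URI is my assumption about what "safely" can mean here, not a description of ra-law-db internals.

```python
import importlib
import sqlite3
import sys
import tempfile
from contextlib import contextmanager
from importlib import resources
from pathlib import Path

# Build a throwaway package containing a small SQLite file.
tmp_dir = tempfile.TemporaryDirectory()
pkg_dir = Path(tmp_dir.name) / "demo_law_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
seed = sqlite3.connect(pkg_dir / "regulatory.sqlite3")
seed.execute("CREATE TABLE laws (cas_number TEXT, law TEXT)")
seed.execute("INSERT INTO laws VALUES ('75-09-2', 'example-law')")
seed.commit()
seed.close()
sys.path.insert(0, tmp_dir.name)
importlib.invalidate_caches()


@contextmanager
def open_packaged_db(package: str, filename: str):
    """Yield a read-only connection to a SQLite file shipped inside a package."""
    resource = resources.files(package) / filename
    # as_file() provides a real filesystem path even for zipped packages.
    with resources.as_file(resource) as db_path:
        # mode=ro makes SQLite reject any write to the packaged file.
        conn = sqlite3.connect(f"{db_path.as_uri()}?mode=ro", uri=True)
        try:
            yield conn
        finally:
            conn.close()


with open_packaged_db("demo_law_pkg", "regulatory.sqlite3") as conn:
    rows = conn.execute(
        "SELECT law FROM laws WHERE cas_number = ?", ("75-09-2",)
    ).fetchall()
    print(rows)
```

Opening the file read-only means a consumer cannot accidentally modify the dataset that was shipped with a given package version.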

When the dataset is refreshed, downstream systems and MCP servers can update the package version instead of sending people back to manually inspect spreadsheets.

3. ra-library: Calculate, Explain, and Compare Controls

ra-library is the core calculation engine.

Its README describes it as an independent workflow built from publicly documented methodology references: public CREATE-SIMPLE design/manual documents, HSE COSHH Essentials, ECETOC TRA, and the Potts-Guy equation for dermal absorption.

The library focuses on four things.

Return verbose calculation steps with references.
Compare what-if scenarios for controls such as ventilation, duration, and PPE.
Generate prioritized risk-reduction recommendations.
Explicitly explain when Level I is impossible within the calculation logic.

Python

from ra_library import calculate_risk

# Pure dichloromethane (CAS 75-09-2), 0.5 hours per day, five days per week.
result = calculate_risk(
    substances=[{"cas_number": "75-09-2", "content_percent": 100}],
    preset="lab_organic",
    conditions={
        "amount": "small",
        "ventilation": "basic",
        "work_area_size": "medium",
    },
    duration={"hours": 0.5, "days_per_week": 5},
    include_explanation=True,  # return the calculation steps as well
    language="ja",
)

# Methodology version used, then the risk label for this component.
print(result.data["methodology"]["version"])
print(result.data["components"]["75-09-2"]["risk_label"])

Before writing this post, I checked the same dichloromethane example through the RA MCP tool. For a scenario of 0.5 hours per day, five days per week, with basic ventilation, the tool used a CREATE-SIMPLE v3.2 methodology path and returned II-A for the 8-hour TWA side and III for the short-term side, along with structured calculation steps, coefficients, and warnings.

The point is not just to say "this is hazardous."

The point is to compare, on the same calculation basis, what happens if you add local exhaust ventilation, reduce work duration, or use respiratory protection as a later control. MHLW's Q&A explains the hierarchy of risk reduction measures and why respiratory protective equipment should not be assumed at the first estimation stage. The library design follows that separation: estimate first, then compare controls.
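The "same calculation basis" idea can be sketched as a small harness: one baseline, a set of overrides, and a single evaluator applied to all of them. Everything here is hypothetical. `compare_scenarios` and `toy_exposure_index` are names made up for this example, and the factors are placeholders, not CREATE-SIMPLE coefficients. In practice the evaluator would wrap a real estimator such as `calculate_risk`.

```python
from typing import Any, Callable, Dict, List, Tuple

Conditions = Dict[str, Any]

# Placeholder factors for illustration only. These are NOT CREATE-SIMPLE
# coefficients and must not be used for a real assessment.
TOY_VENTILATION_FACTOR = {"basic": 1.0, "local_exhaust": 0.1}


def toy_exposure_index(conditions: Conditions) -> float:
    """Hypothetical stand-in for a real exposure estimator."""
    factor = TOY_VENTILATION_FACTOR[conditions["ventilation"]]
    return factor * conditions["hours_per_day"] / 8.0


def compare_scenarios(
    baseline: Conditions,
    what_ifs: Dict[str, Conditions],
    evaluate: Callable[[Conditions], float],
) -> List[Tuple[str, float]]:
    """Run the baseline and every what-if through the same evaluator."""
    results = [("baseline", evaluate(baseline))]
    for label, overrides in what_ifs.items():
        # Each what-if changes only the listed keys; the rest is shared.
        results.append((label, evaluate({**baseline, **overrides})))
    return results


baseline = {"ventilation": "basic", "hours_per_day": 0.5}
what_ifs = {
    "add local exhaust": {"ventilation": "local_exhaust"},
    "halve the duration": {"hours_per_day": 0.25},
}
for label, index in compare_scenarios(baseline, what_ifs, toy_exposure_index):
    print(f"{label}: {index:.4f}")
```

Because every scenario differs from the baseline only by its overrides, any difference in the output can be attributed to the control being tested. Respiratory protection would be added as a further what-if after the engineering controls, not folded into the baseline.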

Making It Agent-Readable

With these libraries exposed over MCP, an AI agent can handle a conversation like this:

"We use dichloromethane for about 30 minutes per day in our work area. Is it subject to risk assessment? We currently only have general ventilation. Is that enough? If not, what should we change?"

Behind the scenes, the agent can call tools in this order.

Use risk_assessment_list to screen CAS 75-09-2.
Use ra-law-db to check related legal frameworks.
Use ra-library to model the work conditions and calculate inhalation, dermal, and physical risks.
Compare local exhaust, shorter duration, and PPE scenarios through what-if analysis.
Return the result, reasoning, limitations, and official references that a human should verify.
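The call order above can be sketched as a tiny orchestrator. This is an assumption-laden illustration, not the MCP servers' code: `AssessmentReport`, `run_pipeline`, and the stub tools are hypothetical, and the stub outputs are dummies standing in for real tool responses.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Tool = Callable[[str], Dict[str, Any]]


@dataclass
class AssessmentReport:
    cas_number: str
    steps: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    limitations: List[str] = field(default_factory=list)
    needs_human_review: bool = True  # the pipeline never clears this flag


def run_pipeline(cas_number: str, tools: List[Tuple[str, Tool]]) -> AssessmentReport:
    """Call each tool in order and record structured output per step."""
    report = AssessmentReport(cas_number)
    for name, tool in tools:
        try:
            report.steps[name] = tool(cas_number)
        except Exception as exc:
            # A failed step is surfaced as a limitation, not silently dropped.
            report.limitations.append(f"{name} failed: {exc}")
    return report


# Stubs standing in for real tool calls; their outputs are dummies.
def stub_screening(cas: str) -> Dict[str, Any]:
    return {"source": "risk_assessment_list", "status": "stub"}


def stub_law_lookup(cas: str) -> Dict[str, Any]:
    raise LookupError("no data for " + cas)


def stub_risk(cas: str) -> Dict[str, Any]:
    return {"source": "ra-library", "status": "stub"}


report = run_pipeline(
    "75-09-2",
    [("screening", stub_screening), ("law_lookup", stub_law_lookup), ("risk", stub_risk)],
)
print(list(report.steps), report.limitations, report.needs_human_review)
```

Recording a failed step as a limitation mirrors the last item in the list: what the agent returns should show a human reviewer what was checked and what was not.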

This changes the agent from a free-form safety advice generator into a client for structured data and structured calculations. Human review is still required, but the system can reduce missed target substances, spreadsheet transcription errors, and forgotten legal checks.

Handling Official Data

This is the part I want to be very clear about.

These repositories do distribute CREATE-SIMPLE-derived data. To make CREATE-SIMPLE-compatible calculations callable from code, the packages ship reference data, coefficients, and decision tables as SQLite and Python package data.

They mainly distribute these artifacts:

calculation logic reconstructed from public design/manual documents and public CREATE-SIMPLE information
SQLite databases containing coefficients, decision tables, and reference values needed for CREATE-SIMPLE-compatible calculations
SQLite databases built from public MHLW, JOHAS, NITE, and related data sources

What they do not ship is the official MHLW CREATE-SIMPLE Excel workbook itself, its macros, UI, sheet structure, or a modified derivative workbook. The ra-library README explicitly states that packaged reference data is SQLite, not original workbook files, and that the library is not an official MHLW distribution.

So the distinction is not whether CREATE-SIMPLE-derived calculation data is distributed; it is. The distinction is that nothing is distributed as the official Excel file or a modified copy of it. That wording matters, because a vaguer claim would make the article sound safer and less direct than the actual implementation.

License and Public Intent

ra-law-db and ra-library currently declare MIT license metadata in their pyproject.toml files. risk_assessment_list is being built with the same practical intent: to keep the screening layer easy to embed in internal systems, SaaS products, MCP servers, and research workflows.

I do not plan to charge for this foundation itself.

There is real value in paid consulting, expert review, and managed chemical safety systems. I am not arguing against any of that. But if the minimal steps of "is this substance in scope," "which laws may apply," and "what does the semi-quantitative risk look like" are locked entirely behind paid systems, the workplaces that most need help may be the least able to access it.

The entry point for protecting workers should be as low-friction as possible. That is the reason for publishing this.

Legal Risk, and Why I Published Anyway

There is some risk in publishing this kind of library.

CREATE-SIMPLE is a public support tool. Its design criteria and manuals are public. Still, I ultimately cannot control how agencies or related organizations will interpret an independent Python implementation combined with SQLite and MCP.

I am publishing it for one reason:

I want workers to be protected.

I want to work in places where the substances I handle are assessed properly and the necessary controls are in place. That is not a special demand. It is a normal expectation. If I want that for myself, I also want it for people in other companies, industries, and regions.

If a piece of the problem can be solved in code, it should be solved in code. Put it somewhere people can use for free. Make it callable from agents. That reduces, at least a little, the gap between "the system exists" and "a small workplace can actually implement it."

If MHLW or a related public institution reviews the article or repositories and decides that publication should be withdrawn, please contact me through GitHub Issues or the contact information in my profile. If there is an official request, I will respond quickly. The intent is not conflict; it is to make tools I believe are needed in the field available to the people who need them.

Closing

Chemical risk assessment is not only something workplaces are required to do. It is something that can make people safer when it is actually done.

With Python libraries, SQLite, and MCP, the workflow can become much thinner:

Extract ingredients and CAS numbers from an SDS.
Check whether the substance is in the risk-assessment target set.
Screen related laws.
Estimate risk from work conditions.
Compare risk reduction controls through what-if analysis.
Return the result with limitations and official references for human review.

Autonomous management should not mean "every employer is left alone." It should mean employers have tools, data, and implementations they can verify and adapt.

The repositories are here:

Ameyanagi/risk_assessment_list: risk-assessment target and GHS notice screening
Ameyanagi/ra-law-db: Japanese chemical law dataset
Ameyanagi/ra-library: risk calculation, explanation, and recommendation engine
Ameyanagi/ra-mcp: MCP server for ra-library
Ameyanagi/ra-law-mcp: MCP server for ra-law-db

The portfolio project list also has dedicated entries for RA Library, ra-law-db, and risk_assessment_list.

Issues, pull requests, and real use cases are welcome. If code can increase the chance that workers go home healthy, I want that code to be in the open.