ResearchProject/FESurrogateModelTutorial/docs/ARCHITECTURE.md

# Architecture

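## Overview
This is an educational Python project centered on documentation and Jupyter notebooks. Reusable computation logic is modularized under `src/femsurrogate/`, while the notebooks handle explanation, visualization, execution order, and interpretation of results.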

## Tech Stack
- Python `>=3.12,<3.15`
- `uv` + `pyproject.toml` + `uv.lock`
- NumPy, SciPy
- pandas, matplotlib
- scikit-learn
- JupyterLab, ipykernel, nbconvert
- pytest, ruff
- joblib, CSV, JSON

## Directory Structure
```text
docs/
  PRD.md
  ARCHITECTURE.md
  ADR.md
  theory/
    00_surrogate_modeling_for_fem.md
    01_doe_sampling_validation.md
    02_response_surface_methodology.md
    03_gaussian_process_kriging.md
    04_random_forest.md
    05_gradient_boosting.md
    06_mlp_neural_network.md

BeamExamples/
  CantileverBeam.txt
  CantileverBeam_Displacements.txt

notebooks/
  00_beam2d_fea_dataset.ipynb
  01_response_surface_surrogate.ipynb
  02_gaussian_process_kriging_surrogate.ipynb
  03_random_forest_surrogate.ipynb
  04_gradient_boosting_surrogate.ipynb
  05_mlp_surrogate.ipynb
  06_compare_surrogate_models.ipynb

src/femsurrogate/
  fea/
    element.py
    model.py
    io.py
    assembly.py
    solver.py
    responses.py
    benchmark.py
  data/
    bounds.py
    sampling.py
    dataset.py
    schema.py
  surrogates/
    common.py
    rsm.py
    gpr.py
    random_forest.py
    boosting.py
    mlp.py
    registry.py
  plotting/
    diagnostics.py
    comparison.py

tests/
data/
  reference/
  processed/
reports/
  results/
  predictions/
  figures/
```

## Core Module Responsibilities
### `femsurrogate.fea`
Handles linear static analysis based on 2D Euler-Bernoulli beam/frame elements.

- `element.py`: local 6x6 stiffness matrix and coordinate transformation matrix.
- `model.py`: dataclasses for nodes, elements, supports, loads, materials, sections, and parameters.
- `io.py`: a simple text fixture parser for `BeamExamples/*.txt`.
- `assembly.py`: assembly of the global sparse stiffness matrix.
- `solver.py`: applies boundary conditions and solves `K u = f`.
- `responses.py`: computes tip displacement, bending stress, mass, and compliance.
- `benchmark.py`: verification helpers such as the cantilever analytical solution.

### `femsurrogate.data`
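Handles input space definition, sampling, batch analysis, and the dataset schema.

- `bounds.py`: design variable ranges and units.
- `sampling.py`: Latin Hypercube Sampling.
- `dataset.py`: runs the FEM for each sample and saves the CSV and metadata.
- `schema.py`: column names, target names, units, and the split seed.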
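
The stratify-then-shuffle idea behind Latin Hypercube Sampling can be sketched with NumPy alone. This is a minimal illustration, not the project's `sampling.py`: the function names `lhs_unit_samples` and `scale_to_bounds` and the bound values are hypothetical. SciPy's `scipy.stats.qmc.LatinHypercube` offers a ready-made sampler built on the same principle.

```python
import numpy as np


def lhs_unit_samples(n: int, dim: int, seed: int) -> np.ndarray:
    """Return an (n, dim) Latin Hypercube sample in the unit cube [0, 1)."""
    rng = np.random.default_rng(seed)
    # One random point inside each of the n equal-width strata, per dimension.
    points = (np.arange(n)[:, None] + rng.random((n, dim))) / n
    # Shuffle the strata independently in every dimension.
    for j in range(dim):
        points[:, j] = rng.permutation(points[:, j])
    return points


def scale_to_bounds(unit: np.ndarray, lower, upper) -> np.ndarray:
    """Map unit-cube samples linearly onto [lower, upper] per column."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + unit * (upper - lower)


# Hypothetical bounds for two design variables (placeholder values).
unit = lhs_unit_samples(n=10, dim=2, seed=0)
samples = scale_to_bounds(unit, lower=[1.0, 0.05], upper=[3.0, 0.20])
```

Each column of `unit` places exactly one point in every stratum of width `1/n`, which is the property that distinguishes LHS from plain uniform sampling.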

### `femsurrogate.surrogates`
Handles creation, training, and evaluation of scikit-learn based models.

- `common.py`: train/test split, scaling, metrics, timing, and JSON export.
- `rsm.py`: `PolynomialFeatures` + `Ridge`.
- `gpr.py`: `GaussianProcessRegressor`.
- `random_forest.py`: `RandomForestRegressor`.
- `boosting.py`: `GradientBoostingRegressor`.
- `mlp.py`: `MLPRegressor`.
- `registry.py`: a small registry that lets notebooks look up a builder by model name.

### `femsurrogate.plotting`
Handles model diagnostics and the final comparison figures.

- `diagnostics.py`: parity plot, residual plot, error histogram.
- `comparison.py`: per-model metric table, bar plot, prediction-time comparison.

## Key Interfaces
```python
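from __future__ import annotations  # domain type names stay unevaluated in this listing

import pandas as pd


def run_beam2d_case(params: BeamParameters) -> AnalysisResult:
    """Run one linear static Beam2D analysis for a single parameter set."""
    ...

def generate_lhs_samples(bounds: ParameterBounds, n: int, seed: int) -> pd.DataFrame:
    """Draw n Latin Hypercube samples within the bounds, reproducible by seed."""
    ...

def build_dataset(samples: pd.DataFrame) -> pd.DataFrame:
    """Run the FEM for every sample row and return the resulting dataset."""
    ...

def make_model(model_name: str, random_state: int):
    """Create the surrogate model registered under model_name."""
    ...

def evaluate_model(model, X_train, X_test, y_train, y_test) -> MetricsReport:
    """Evaluate a model on the given train/test split and return its metrics."""
    ...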
```

## Data Flow
```text
Parameter bounds
  -> Latin Hypercube samples
  -> Beam2D FEM batch analysis
  -> data/reference/beam2d_lhs_300.csv
  -> model-specific notebooks
  -> reports/results/<model>_metrics.json
  -> reports/predictions/<model>_predictions.csv
  -> notebooks/06_compare_surrogate_models.ipynb
```

## FEM Analysis Design
- Element: 2-node 2D Euler-Bernoulli frame element.
- Nodal DOFs: `[ux, uy, rz]`.
- Local stiffness matrix: a 6x6 matrix with axial `EA/L` terms and bending `EI` terms.
- Global assembly: sparse-matrix based.
- Solver: eliminates the constrained DOFs, then calls `scipy.sparse.linalg.spsolve`.
- Reference check: cantilever tip displacement `P L^3 / (3 E I)`.
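
The element formulation and the reference check listed above can be sketched for a single-element cantilever. This is an illustration under stated assumptions, not the project's `element.py` or `solver.py`: the function names and the numeric values are hypothetical, and a dense `numpy.linalg.solve` stands in for the sparse solver to keep the example short.

```python
import numpy as np


def local_stiffness(E: float, A: float, I: float, L: float) -> np.ndarray:
    """6x6 local stiffness of a 2-node Euler-Bernoulli frame element.

    DOF order: [ux1, uy1, rz1, ux2, uy2, rz2].
    """
    a = E * A / L        # axial stiffness
    b = E * I / L**3     # bending stiffness factor
    return np.array([
        [a, 0.0, 0.0, -a, 0.0, 0.0],
        [0.0, 12 * b, 6 * b * L, 0.0, -12 * b, 6 * b * L],
        [0.0, 6 * b * L, 4 * b * L**2, 0.0, -6 * b * L, 2 * b * L**2],
        [-a, 0.0, 0.0, a, 0.0, 0.0],
        [0.0, -12 * b, -6 * b * L, 0.0, 12 * b, -6 * b * L],
        [0.0, 6 * b * L, 2 * b * L**2, 0.0, -6 * b * L, 4 * b * L**2],
    ])


def transformation(angle: float) -> np.ndarray:
    """6x6 matrix that rotates global DOFs into the element's local axes."""
    c, s = np.cos(angle), np.sin(angle)
    r = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
    t = np.zeros((6, 6))
    t[:3, :3] = r
    t[3:, 3:] = r
    return t


def global_stiffness(E, A, I, L, angle):
    """Element stiffness expressed in global axes: T^T k_local T."""
    t = transformation(angle)
    return t.T @ local_stiffness(E, A, I, L) @ t


# Single-element cantilever: node 1 fixed, tip load P in -y at node 2.
E, A, I, L, P = 210e9, 1e-3, 1e-6, 2.0, 1000.0  # placeholder values
k = global_stiffness(E, A, I, L, angle=0.0)
free = [3, 4, 5]                     # DOFs of the free node
f = np.array([0.0, -P, 0.0])         # [Fx, Fy, Mz] at the tip
u = np.linalg.solve(k[np.ix_(free, free)], f)
tip_uy = u[1]
analytical = -P * L**3 / (3 * E * I)
```

Because the cubic Euler-Bernoulli shape functions reproduce the exact deflection under a tip load, `tip_uy` agrees with `analytical` up to floating-point error even with one element.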

## Solver Verification Fixture
The cantilever example in `BeamExamples/` is the canonical regression fixture for the solver implementation.

```text
BeamExamples/CantileverBeam.txt
BeamExamples/CantileverBeam_Displacements.txt
```

The input file contains the following items.

- Section/material metadata: `Area`, `J`, `Iyy`, `Izz`, `ElasticModulus`, `Poisson'sRatio`.
- Geometry: `Node, NodeID, X, Y`.
- Connectivity: `Beam, BeamID, NodeID1, NodeID2`.
- Boundary condition: `Fix, NodeID`.
- Load: `NodeLoad, NodeID, Fx, Fy, Mz`.

The 2D in-plane Euler-Bernoulli frame solver uses `Area`, `Izz`, `ElasticModulus`, the node coordinates, beam connectivity, fixed nodes, and nodal loads. `J`, `Iyy`, and `Poisson'sRatio` are preserved as fixture metadata but are not used in the v1 analysis.

The reference displacement file has the following format.

```text
# NodeID, Ux, Uy, Rz
1  0.000000, 0.000000, 0.000000
...
```

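The verification test compares `[Ux, Uy, Rz]` at every node of the solver result against the reference file. Because the reference values are rounded to six decimal places, rounding alone can shift a value by up to `5e-7`, so the default tolerances are `atol=5e-7` and `rtol=1e-6`. The tip displacement is additionally evaluated on its own and compared with `P L^3 / (3 E I)` to confirm its sign and magnitude.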
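
The node-by-node comparison can be sketched with the standard library only. The helper names `parse_displacements` and `find_mismatches` are hypothetical, and the second reference row is a made-up placeholder rather than a value from the actual fixture. The tolerance rule `|c - r| <= atol + rtol * |r|` is the same one `numpy.allclose` applies.

```python
def parse_displacements(text: str) -> dict[int, tuple[float, float, float]]:
    """Parse 'NodeID  Ux, Uy, Rz' rows; '#' lines and blank lines are skipped."""
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Treat commas and whitespace alike as separators.
        node_id, ux, uy, rz = line.replace(",", " ").split()
        rows[int(node_id)] = (float(ux), float(uy), float(rz))
    return rows


def find_mismatches(computed, reference, atol=5e-7, rtol=1e-6):
    """List (node, component) pairs where |c - r| > atol + rtol * |r|."""
    mismatches = []
    for node_id, ref_values in sorted(reference.items()):
        for name, c, r in zip(("Ux", "Uy", "Rz"), computed[node_id], ref_values):
            if abs(c - r) > atol + rtol * abs(r):
                mismatches.append((node_id, name))
    return mismatches


# Placeholder reference text; row 2 is illustrative, not real fixture data.
reference_text = """\
# NodeID, Ux, Uy, Rz
1  0.000000, 0.000000, 0.000000
2  0.000000, -0.012698, -0.009524
"""
reference = parse_displacements(reference_text)

# Hypothetical solver output at full precision.
computed = {
    1: (0.0, 0.0, 0.0),
    2: (0.0, -0.0126984127, -0.0095238095),
}
mismatches = find_mismatches(computed, reference)
```

With these placeholder numbers the unrounded values differ from the six-decimal reference by less than `5e-7`, so `mismatches` stays empty. A tighter `atol` would flag rounding noise as a failure.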

## Dataset Schema
The default dataset uses columns in SI units.

```text
L_m
b_m
h_m
E_pa
P_n
A_m2
I_m4
tip_uy_m
max_abs_bending_stress_pa
mass_kg
compliance_j
```

The metadata JSON includes the following fields.

```text
dataset_name
created_by
sample_count
random_seed
parameter_bounds
target_columns
unit_system
fea_model
notes
```

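## Notebook Responsibilities
- Each notebook must run top to bottom in a single pass.
- Do not keep long implementations of core FEM or ML helpers in notebook cells.
- Every model-specific notebook uses the same dataset, targets, and split seed.
- Every model-specific notebook saves a metrics JSON and a predictions CSV.
- The final comparison notebook only reads the outputs of the earlier notebooks and compares them.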

## Verification Strategy
- Unit tests: FEM element, solver, dataset generation, model factory.
- Regression tests: analyze `BeamExamples/CantileverBeam.txt` and compare against `BeamExamples/CantileverBeam_Displacements.txt`.
- Integration tests: dataset generation with a small sample count, plus a smoke test of every surrogate.
- Notebook tests: sequential execution with `nbconvert --execute`.
- Documentation checks: review that paths and notebook/file names in the docs match the actual structure.