fix: development work prior to sample/plate

2026-05-28 11:56:17 +08:00
parent fc36bc83e3
commit 8b65de36b8
367 changed files with 57752 additions and 947 deletions
--- a/docs/architecture/00-overall-data-architecture.md
+++ b/docs/architecture/00-overall-data-architecture.md
@@ -0,0 +1,227 @@
+# BrAPI Test Server Overall Data Architecture Diagram
+
+This document connects the four modules into a single overview diagram:
+
+```text
+Core -> Germplasm/Seed -> Phenotyping -> Genotyping
+```
+
+The corresponding module documents:
+
+| Module | Document | Core role |
+| --- | --- | --- |
+| Core | `core-data-flow.md` | Base context such as crop, program, trial, study, location, and person |
+| Germplasm/Seed | `04-germplasm-seed-data-flow.md` | germplasm、breeding_method、seed_lot、cross、pedigree、attribute |
+| Phenotyping | `02-phenotyping-data-flow.md` | observation_unit、observation_variable、event、image、observation |
+| Genotyping | `03-genotyping-data-flow.md` | sample、plate、reference、variantset、variant、callset、allele_call |
+
+## Overall Conclusion
+
+The backbone of the entire data model is:
+
+```text
+Core: crop -> program -> trial -> study
+Germplasm: breeding_method -> germplasm -> cross / seed_lot / pedigree / attribute
+Phenotyping: study + germplasm/seed_lot/cross -> observation_unit -> observation
+Genotyping: observation_unit/study -> sample -> callset -> allele_call
+Genotyping: reference_set -> variantset -> variant -> allele_call
+```
+
+`study` is the main bridge from Core to Phenotyping/Genotyping; `germplasm` is the main bridge from Germplasm/Seed to Phenotyping/Genotyping; `observation_unit` is the main bridge from Phenotyping to Genotyping.
+
+## Overall Architecture Diagram
+
+```mermaid
+flowchart TD
+    subgraph CORE["Core base context"]
+        CROP["crop<br/>Crop"]
+        PERSON["person<br/>Person"]
+        PROGRAM["program<br/>Program"]
+        LOCATION["location<br/>Location"]
+        TRIAL["trial<br/>Trial batch"]
+        SEASON["season<br/>Season"]
+        STUDY["study<br/>Study / trial execution unit"]
+        LIST["list / list_item<br/>Generic lists"]
+
+        CROP --> PROGRAM
+        PERSON --> PROGRAM
+        PROGRAM --> TRIAL
+        CROP --> TRIAL
+        PROGRAM --> LOCATION
+        CROP --> LOCATION
+        TRIAL --> STUDY
+        PROGRAM --> STUDY
+        CROP --> STUDY
+        LOCATION --> STUDY
+        SEASON --> STUDY
+        PERSON --> LIST
+    end
+
+    subgraph GERM["Germplasm / Seed"]
+        BM["breeding_method<br/>Breeding method"]
+        GERMPLASM["germplasm<br/>Germplasm"]
+        GAD["germplasm_attribute_definition<br/>Attribute definition"]
+        GAV["germplasm_attribute_value<br/>Attribute value"]
+        CP["crossing_project<br/>Crossing project"]
+        CROSS["cross_entity<br/>Cross / PlannedCross"]
+        XP["cross_parent<br/>Cross parent"]
+        PEDNODE["pedigree_node<br/>Pedigree node"]
+        PEDEDGE["pedigree_edge<br/>Pedigree edge"]
+        SEEDLOT["seed_lot<br/>Seed lot"]
+        MIX["seed_lot_content_mixture<br/>Lot composition"]
+        TX["seed_lot_transaction<br/>Lot transaction"]
+
+        BM --> GERMPLASM
+        GAD --> GAV
+        GERMPLASM --> GAV
+        CP --> CROSS
+        CROSS --> XP
+        GERMPLASM --> XP
+        CROSS --> CROSS_PLANNED["cross_entity<br/>planned cross self-reference"]
+        GERMPLASM --> PEDNODE
+        CP --> PEDNODE
+        PEDNODE --> PEDEDGE
+        PEDEDGE --> PEDNODE2["pedigree_node<br/>Parent / child node"]
+        GERMPLASM --> MIX
+        CROSS --> MIX
+        MIX --> SEEDLOT
+        SEEDLOT --> TX
+        TX --> SEEDLOT
+    end
+
+    subgraph PHENO["Phenotyping"]
+        ONTOLOGY["ontology<br/>Ontology"]
+        TRAIT["trait<br/>Trait"]
+        METHOD["method<br/>Method"]
+        SCALE["scale<br/>Scale"]
+        OV["observation_variable<br/>Observation variable"]
+        OU["observation_unit<br/>Observation unit"]
+        EVENT["event<br/>Event"]
+        IMAGE["image<br/>Image"]
+        OBS["observation<br/>Observation value"]
+
+        ONTOLOGY --> TRAIT
+        ONTOLOGY --> METHOD
+        ONTOLOGY --> SCALE
+        TRAIT --> OV
+        METHOD --> OV
+        SCALE --> OV
+        OU --> OBS
+        OV --> OBS
+        EVENT --> OU
+        OU --> IMAGE
+        IMAGE --> OBS
+    end
+
+    subgraph GENO["Genotyping"]
+        PLATE["plate<br/>Sample plate"]
+        SAMPLE["sample<br/>Sample"]
+        REFSET["reference_set<br/>Reference set"]
+        REF["reference<br/>Reference sequence"]
+        REFB["reference_bases<br/>Reference fragment"]
+        VARSET["variantset<br/>Variant set"]
+        VARIANT["variant<br/>Variant site"]
+        CALLSET["callset<br/>Sample call set"]
+        CALL["allele_call<br/>Genotype result"]
+        GMAP["genome_map<br/>Genetic map"]
+        LG["linkageGroup<br/>Linkage group"]
+        MP["marker_position<br/>Map position"]
+
+        PLATE --> SAMPLE
+        SAMPLE --> CALLSET
+        CALLSET --> CALL
+        REFSET --> REF
+        REF --> REFB
+        REFSET --> VARSET
+        VARSET --> VARIANT
+        REFSET --> VARIANT
+        VARIANT --> CALL
+        GMAP --> LG
+        LG --> MP
+        VARIANT --> MP
+    end
+
+    CROP --> GERMPLASM
+    CROP --> GAD
+    TRAIT --> GAD
+    METHOD --> GAD
+    SCALE --> GAD
+    ONTOLOGY --> GAD
+    PROGRAM --> CP
+    PROGRAM --> SEEDLOT
+    LOCATION --> SEEDLOT
+
+    STUDY --> OU
+    TRIAL --> OU
+    PROGRAM --> OU
+    CROP --> OU
+    GERMPLASM --> OU
+    SEEDLOT --> OU
+    CROSS --> OU
+
+    STUDY --> EVENT
+    STUDY --> OBS
+    TRIAL --> OBS
+    PROGRAM --> OBS
+    CROP --> OBS
+
+    STUDY --> PLATE
+    TRIAL --> PLATE
+    PROGRAM --> PLATE
+    STUDY --> SAMPLE
+    TRIAL --> SAMPLE
+    PROGRAM --> SAMPLE
+    OU --> SAMPLE
+
+    GERMPLASM --> REFSET
+    STUDY --> VARSET
+    CROP --> GMAP
+```
+
+## Key Cross-Module Bridges
+
+| Bridge | Modules connected | Description |
+| --- | --- | --- |
+| `crop` | Core -> Germplasm/Pheno/Geno | The crop dimension runs through program, trial, study, germplasm, variables, and maps |
+| `program` | Core -> Germplasm/Seed/Pheno/Geno | The program dimension connects crossing_project, seed_lot, observation_unit, sample, and plate |
+| `trial` | Core -> Pheno/Geno | The trial dimension connects study, observation_unit, observation, sample, and plate |
+| `study` | Core -> Pheno/Geno | The most important experimental context; connects observation_unit, event, observation, sample, plate, and variantset |
+| `germplasm` | Germplasm -> Pheno/Geno | Germplasm can connect to observation_unit, cross_parent, seed_lot_content_mixture, and reference_set |
+| `seed_lot` | Germplasm/Seed -> Pheno | A SeedLot can serve as the material source of an observation_unit |
+| `cross_entity` | Germplasm/Seed -> Pheno | A Cross/PlannedCross can be the source of an observation_unit or a seed_lot_content_mixture |
+| `observation_unit` | Pheno -> Geno | A phenotyping observation unit can produce, or be linked to, a genotyping sample |
+| `sample` | Geno internal entry point | Leads from observation_unit/study/trial/program into callset and allele_call |
+| `variant` | Geno internal locus | Connects to allele_call and marker_position, and locates the genotype results |
+
+## Recommended Overall Entry Order
+
+1. Enter the Core base context: `crop`, `person`, `program`, `location`, `trial`, `season`, `study`.
+2. Enter the Germplasm upstream data: `breeding_method`, plus the `trait/method/scale/ontology` records that `germplasm_attribute_definition` depends on.
+3. Enter `germplasm`, then add extended information such as `germplasm_attribute_value`, donor, origin, institute, synonym, and taxon.
+4. If crossing is involved, enter `crossing_project`, `cross_entity`, and `cross_parent`; planned crosses are expressed through `cross_entity.planned` and the `planned_cross_id` self-reference.
+5. Enter Seed data: `seed_lot`, `seed_lot_content_mixture`, `seed_lot_transaction`.
+6. Enter Phenotyping definitions: `ontology`, `trait`, `method`, `scale`, `observation_variable`.
+7. Enter Phenotyping entities and facts: `observation_unit`, `event`, `image`, `observation`.
+8. Enter the Genotyping sample entry points: `plate`, `sample`.
+9. Enter Genotyping references and variants: `reference_set`, `reference`, `reference_bases`, `variantset`, `variant`.
+10. Enter Genotyping results: `callset`, `callset_variant_sets`, `allele_call`.
+11. If genetic-map positioning is needed, enter `genome_map`, `linkageGroup`, `marker_position`.
+
+## Module Boundaries at a Glance
+
+| Module | Root node | Main fact tables | Output to other modules |
+| --- | --- | --- | --- |
+| Core | `crop/program/trial/study` | `study` | Provides context to all business modules |
+| Germplasm/Seed | `germplasm` | `germplasm_attribute_value`, `seed_lot_content_mixture`, `seed_lot_transaction`, `cross_parent`, `pedigree_edge` | Provides material sources to Pheno and the reference source to Geno |
+| Phenotyping | `observation_unit` | `observation` | Provides Geno with the observed object that a sample comes from |
+| Genotyping | `sample`, `variant` | `allele_call` | Outputs the genotype result of a sample at a locus |
+
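+## Key Points
+
+1. `study` is the context entry point for most experimental data; data that enters Pheno or Geno should normally be traceable back to a `study`.
+2. `germplasm` describes germplasm master data and `seed_lot` describes inventory lots; the two are linked indirectly through `seed_lot_content_mixture`.
+3. `plannedcross` has no separate database table; it is stored in `cross_entity` and expressed through `planned` and `planned_cross_id`.
+4. `observation_unit` can link to `germplasm`, `seed_lot`, and `cross_entity`, and is the entry point through which material enters phenotypic observation.
+5. `sample` can come from an `observation_unit` and also carries redundant links to `study/trial/program`; it is the entry point of the genotyping workflow.
+6. `allele_call` is the final genotype result table and connects `callset` with `variant`.
+7. `additional_info` and `external_references` are generic extension tables shared across modules; they are left out of the main diagram so as not to obscure the backbone relationships.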
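The recommended entry order above amounts to a topological ordering of the table dependency graph. The following is a minimal Python sketch and is not part of the commit: `DEPENDS_ON` is a hand-picked subset of the backbone relationships listed in this document (optional links such as `germplasm -> reference_set` are omitted), and the names `DEPENDS_ON` and `entry_order` are illustrative assumptions.

```python
from graphlib import TopologicalSorter

# Each key maps a table to the tables it depends on (its upstream tables).
# Assumption: only a subset of the documented backbone is modeled here.
DEPENDS_ON = {
    "crop": set(),
    "program": {"crop"},
    "trial": {"program"},
    "study": {"trial"},
    "breeding_method": set(),
    "germplasm": {"breeding_method", "crop"},
    "observation_unit": {"study", "germplasm"},
    "observation": {"observation_unit"},
    "sample": {"observation_unit", "study"},
    "callset": {"sample"},
    "reference_set": set(),
    "variantset": {"reference_set", "study"},
    "variant": {"variantset"},
    "allele_call": {"callset", "variant"},
}


def entry_order(depends_on):
    """Return one valid entry order: every table comes after its upstreams."""
    return list(TopologicalSorter(depends_on).static_order())


order = entry_order(DEPENDS_ON)
position = {table: index for index, table in enumerate(order)}
print(order)
```

Any order produced this way satisfies the dependencies; the numbered list above is one such order, grouped by module for readability.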
--- a/docs/architecture/02-phenotyping-data-flow.md
+++ b/docs/architecture/02-phenotyping-data-flow.md
@@ -0,0 +1,290 @@
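+# Phenotyping Module Data Flow and Table Relationships
+
+This document analyzes the data entry order and core table relationships of the Phenotyping module, and how it connects to `study/trial/program/crop` in the Core module.
+
+## Conclusion
+
+The main data line of the Phenotyping module is: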
+
+```text
+core study -> ontology -> trait / method / scale -> observation_variable -> observation_unit -> event / image -> observation
+```
+
+A sequence closer to how the data is entered in practice:
+
+```text
+1. Core data comes first: crop, program, trial, study
+2. Enter Ontology / Trait / Method / Scale
+3. Assemble ObservationVariable
+4. Enter ObservationUnit
+5. Enter Event and Image
+6. Enter Observation
+```
+
+The Phenotyping-related initialization scripts run in this order:
+
+```text
+R__init_data_14_observation_units.sql
+R__init_data_17_events.sql
+R__init_data_18_images.sql
+R__init_data_19_observation_variables.sql
+R__init_data_20_observations.sql
+R__init_data_26_observation_variables2.sql
+R__init_data_26_observation_variables3.sql
+```
+
+Note: to build the demo data, the initialization scripts insert `observation_unit` before `observation_variable`. From a business-modeling perspective, both depend on an existing `study`, while the actual observation value, `observation`, must reference both `observation_unit` and `observation_variable`.
+
+## Core Tables
+
+| Table | Purpose | Main upstream dependencies | Main downstream |
+| --- | --- | --- | --- |
+| `ontology` | Ontology information; defines the source of terms | None | `trait`, `method`, `scale`, `observation_variable` |
+| `ontology_ref` | Ontology reference entries | Can be entered independently | `trait_ontology_reference`, `method_ontology_reference`, `scale_ontology_reference` |
+| `trait` | Trait definition; describes "what is measured" | Optional `ontology` | `observation_variable` |
+| `method` | Measurement method; describes "how it is measured" | Optional `ontology` | `observation_variable` |
+| `scale` | Scale/unit/data type; describes "on what scale the value is expressed" | Optional `ontology` | `observation_variable`, `scale_valid_value_category` |
+| `observation_variable` | Observation variable, composed of trait/method/scale | `crop`, `trait`, `method`, `scale`, `ontology` | `observation`, `study_variable` |
+| `observation_unit` | Observed object, such as plot/plant/block | `crop`, `program`, `trial`, `study`; optional germplasm/seed_lot/cross | `observation`, `event_observation_units`, `image` |
+| `event` | Field event, such as fertilization, irrigation, or sampling | `study` | `event_param`, `event_observation_units` |
+| `event_param` | Event parameter | `event` | `event_parameter_entity_values_by_date` |
+| `image` | Picture/imagery data | Optional `observation_unit`, `geojson` | `image_observations` |
+| `observation` | Actual observation value | `observation_unit`, `observation_variable`, `study`, optional `season` | `image_observations` |
+
+## Recommended Entry Order
+
+### 1. Prepare the upstream Core data
+
+Phenotyping data usually hangs under the Core hierarchy:
+
+```text
+crop -> program -> trial -> study
+```
+
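+`study` is the entry node for Phenotyping. `observation_unit`, `event`, and `observation` all link to `study`, directly or indirectly.
+
+### 2. Enter Ontology
+
+Enter `ontology` first, along with any `ontology_ref` records that are needed.
+
+`ontology` identifies the source of a terminology system; `trait`, `method`, `scale`, and `observation_variable` can all carry ontology information.
+
+### 3. Enter Trait / Method / Scale
+
+Together, these three kinds of data describe one observed indicator: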
+
+```text
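+Trait  = what is measured, e.g. plant height
+Method = how it is measured, e.g. ruler measurement
+Scale  = which unit/data type expresses it, e.g. cm, numeric
+```
+
+If a `scale` has enumerated or categorical values, the following table is also populated: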
+
+```text
+scale_valid_value_category
+```
+
+### 4. Assemble ObservationVariable
+
+Enter `observation_variable`, which references:
+
+```text
+crop
+trait
+method
+scale
+ontology
+```
+
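+In business terms it is "one indicator that can be collected", for example "plant height, ruler method, cm".
+
+`study_variable` is the many-to-many relationship between `study` and `observation_variable`, and indicates which variables a given study collects.
+
+### 5. Enter ObservationUnit
+
+Enter `observation_unit`, which represents the observed object, such as a field, block, plot, or plant.
+
+It usually references: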
+
+```text
+crop
+program
+trial
+study
+```
+
+It can optionally be linked to:
+
+```text
+germplasm
+seed_lot
+cross
+observation_unit_position
+observation_unit_treatment
+observation_unit_level
+```
+
+### 6. Enter Event
+
+Enter `event` to record events that occur on a study or an observation unit.
+
+Common relationships:
+
+```text
+event -> study
+event_observation_units -> observation_unit
+event_param -> event
+```
+
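+### 7. Enter Image
+
+Enter `image`. An image can be linked directly to an `observation_unit`, or to one or more `observation` records through `image_observations`.
+
+Image coordinates use the `geojson/coordinate` extension.
+
+### 8. Enter Observation
+
+Finally, enter `observation`, the core fact data of the Phenotyping module.
+
+An observation usually references all of the following: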
+
+```text
+observation_unit
+observation_variable
+study
+trial
+program
+crop
+season
+```
+
+In the code, `ObservationEntity.setObservationUnit(...)` inherits study/trial/program/crop from the observation unit, so an observation's context can be derived automatically from its observation unit.
+
+## Phenotyping Data Flow Diagram
+
+```mermaid
+flowchart TD
+    C["Core: crop"] --> P["Core: program"]
+    P --> T["Core: trial"]
+    T --> S["Core: study"]
+
+    O["ontology"] --> TR["trait"]
+    O --> M["method"]
+    O --> SC["scale"]
+    OR["ontology_ref (ontology reference)"] --> TR
+    OR --> M
+    OR --> SC
+
+    C --> OV["observation_variable"]
+    TR --> OV
+    M --> OV
+    SC --> OV
+    O --> OV
+
+    S --> SV["study_variable (study-variable link)"]
+    OV --> SV
+
+    C --> OU["observation_unit"]
+    P --> OU
+    T --> OU
+    S --> OU
+    G["Germplasm/SeedLot/Cross (optional)"] --> OU
+    OU --> OUP["position / treatment / level"]
+
+    S --> E["event"]
+    E --> EP["event_param (event parameter)"]
+    E --> EOU["event_observation_units"]
+    OU --> EOU
+
+    OU --> IMG["image"]
+    GEO["geojson / coordinate"] --> IMG
+
+    OU --> OB["observation (observed value)"]
+    OV --> OB
+    S --> OB
+    T --> OB
+    P --> OB
+    C --> OB
+    SE["Core: season"] --> OB
+
+    IMG --> IO["image_observations"]
+    OB --> IO
+```
+
+## Phenotyping ER Diagram
+
+```mermaid
+erDiagram
+    crop ||--o{ observation_variable : "crop_id"
+    crop ||--o{ observation_unit : "crop_id"
+    crop ||--o{ observation : "crop_id"
+
+    program ||--o{ observation_unit : "program_id"
+    program ||--o{ observation : "program_id"
+
+    trial ||--o{ observation_unit : "trial_id"
+    trial ||--o{ observation : "trial_id"
+
+    study ||--o{ observation_unit : "study_id"
+    study ||--o{ event : "study_id"
+    study ||--o{ observation : "study_id"
+    study ||--o{ study_variable : "study_db_id"
+
+    ontology ||--o{ trait : "ontology_id"
+    ontology ||--o{ method : "ontology_id"
+    ontology ||--o{ scale : "ontology_id"
+    ontology ||--o{ observation_variable : "ontology_id"
+
+    ontology_ref ||--o{ trait_ontology_reference : "ontology_reference_id"
+    ontology_ref ||--o{ method_ontology_reference : "ontology_reference_id"
+    ontology_ref ||--o{ scale_ontology_reference : "ontology_reference_id"
+
+    trait ||--o{ observation_variable : "trait_id"
+    method ||--o{ observation_variable : "method_id"
+    scale ||--o{ observation_variable : "scale_id"
+    scale ||--o{ scale_valid_value_category : "scale_id"
+
+    observation_variable ||--o{ observation : "observation_variable_id"
+    observation_variable ||--o{ study_variable : "variable_db_id"
+
+    observation_unit ||--o{ observation : "observation_unit_id"
+    observation_unit ||--o{ observation_unit_position : "observation_unit_id"
+    observation_unit ||--o{ observation_unit_treatment : "observation_unit_id"
+    observation_unit ||--o{ observation_unit_level : "observation_unit_id"
+
+    event ||--o{ event_param : "event_id"
+    event ||--o{ event_observation_units : "event_entity_id"
+    observation_unit ||--o{ event_observation_units : "observation_units_id"
+
+    image ||--o{ image_observations : "image_entity_id"
+    observation ||--o{ image_observations : "observations_id"
+    observation_unit ||--o{ image : "observation_unit_id"
+
+    season ||--o{ observation : "season_id"
+```
+
+## Mapping Between APIs and Tables
+
+| API | Main table | Description |
+| --- | --- | --- |
+| `/brapi/v2/ontologies` | `ontology` | Query and create ontologies |
+| `/brapi/v2/traits` | `trait` | Trait definitions |
+| `/brapi/v2/methods` | `method` | Measurement methods |
+| `/brapi/v2/scales` | `scale` | Scales, units, data types |
+| `/brapi/v2/variables` | `observation_variable` | Observation variables, combining trait/method/scale |
+| `/brapi/v2/observationunits` | `observation_unit` | Observation units |
+| `/brapi/v2/events` | `event` | Field/experiment events |
+| `/brapi/v2/images` | `image` | Image data |
+| `/brapi/v2/observations` | `observation` | Actual observation values |
+
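+## Key Points
+
+1. `study` is the connection point between Phenotyping and Core; most phenotypic data should be attached to a specific study.
+2. `observation_variable` is not a single value; it is an indicator definition made of "trait + method + scale + ontology".
+3. `observation_unit` is the observed object, and `observation` is one measurement result for that object on a given variable.
+4. An `event` can be bound to multiple `observation_unit` records, which suits actions such as fertilization, irrigation, and sampling.
+5. An `image` can be bound directly to an `observation_unit`, or linked to observation values through `image_observations`.
+6. `trait/method/scale/observation_variable` all have `*_additional_info` and `*_external_references` extension tables for extra business fields and external references.
+7. `observation` stores the `crop/program/trial/study` context redundantly; the code inherits this context upward from the `observation_unit` or the `study`, which simplifies queries.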
+
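The context inheritance described for `observation` can be sketched in a few lines of Python. This is an illustration of the idea only, not the project's Java code: the classes `ObservationUnit` and `Observation`, their fields, and the method `set_observation_unit` are hypothetical stand-ins for the `observation_unit` and `observation` rows and for `ObservationEntity.setObservationUnit(...)`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObservationUnit:
    """Hypothetical stand-in for an observation_unit row."""
    unit_id: str
    crop: str
    program: str
    trial: str
    study: str


@dataclass
class Observation:
    """Hypothetical stand-in for an observation row."""
    variable: str
    value: str
    observation_unit: Optional[ObservationUnit] = None
    crop: Optional[str] = None
    program: Optional[str] = None
    trial: Optional[str] = None
    study: Optional[str] = None

    def set_observation_unit(self, unit: ObservationUnit) -> None:
        # Copy the redundant context from the unit, so that observations
        # can be filtered by study/trial/program/crop without a join.
        self.observation_unit = unit
        self.crop = unit.crop
        self.program = unit.program
        self.trial = unit.trial
        self.study = unit.study


plot = ObservationUnit("plot-1", "wheat", "program-1", "trial-1", "study-1")
height = Observation(variable="plant height / ruler / cm", value="92")
height.set_observation_unit(plot)
print(height.study, height.trial, height.program, height.crop)
```

The trade-off in this design is the usual one for denormalized columns: reads get simpler, while any change to a unit's context has to be propagated to its observations.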
--- a/docs/architecture/03-genotyping-data-flow.md
+++ b/docs/architecture/03-genotyping-data-flow.md
@@ -0,0 +1,313 @@
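+# Genotyping Module Data Flow and Table Relationships
+
+This document analyzes the data entry order and core table relationships of the Genotyping module, and the mapping between Java entity names and the actual database table names.
+
+## Conclusion
+
+The main data line of the Genotyping module is:
+
+```text
+Core/Pheno upstream data -> sample / plate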
+ReferenceSet -> Reference -> ReferenceBases
+ReferenceSet + Study -> VariantSet -> Variant
+Sample -> CallSet
+CallSet + Variant -> Call
+GenomeMap -> LinkageGroup -> MarkerPosition -> Variant
+```
+
+A sequence closer to how the data is entered in practice:
+
+```text
+1. Core/Phenotyping upstream data comes first: crop, program, trial, study, observation_unit
+2. Enter Plate and Sample
+3. Enter ReferenceSet, Reference, ReferenceBases
+4. Enter VariantSet
+5. Enter Variant
+6. Enter CallSet
+7. Enter Call, i.e. the genotype results in the allele_call table
+8. Enter GenomeMap, LinkageGroup, MarkerPosition
+```
+
+The Genotyping-related initialization scripts run in this order:
+
+```text
+R__init_data_21_samples.sql
+R__init_data_22_references.sql
+R__init_data_23_variant_set_1.sql
+R__init_data_24_genome_maps.sql
+src/main/resources/db/sql/variant_set_4/variant_set_4.sql
+src/main/resources/db/sql/variant_set_4/variant_set_4_alleles.sql
+```
+
+## Entities and Actual Table Names
+
+| Business concept | Java entity | Database table | Description |
+| --- | --- | --- | --- |
+| Call | `CallEntity` | `allele_call` | Genotype result of a single sample at a given variant |
+| CallSet | `CallSetEntity` | `callset` | A group of calls for one sample; usually the genotype call set of a single sample |
+| Sample | `SampleEntity` | `sample` | Sample submitted for testing/sequencing |
+| Plate | `PlateEntity` | `plate` | Sample plate holding multiple samples |
+| MarkerPosition | `MarkerPositionEntity` | `marker_position` | Position of a variant on a linkage group |
+| Variant | `VariantEntity` | `variant` | Variant site, such as a SNP/Indel |
+| ReferenceSet | `ReferenceSetEntity` | `reference_set` | Reference genome set |
+| GenomeMap | `GenomeMapEntity` | `genome_map` | Genetic map |
+| VariantSet | `VariantSetEntity` | `variantset` | A collection of variants |
+| Reference | `ReferenceEntity` | `reference` | Reference sequence, such as a chromosome/contig |
+| ReferenceBases | `ReferenceBasesPageEntity` | `reference_bases` | Pages of a reference's sequence |
+| LinkageGroup | `LinkageGroupEntity` | `linkageGroup` | Linkage group in a map; note the camel-case table name `linkageGroup` |
+
+## Core Tables
+
+| Table | Purpose | Main upstream dependencies | Main downstream |
+| --- | --- | --- | --- |
+| `plate` | Sample plate | `program`, `trial`, `study`; optional vendor submission | `sample` |
+| `sample` | Sample | `plate`, `observation_unit`, `program`, `trial`, `study` | `callset` |
+| `reference_set` | Reference genome set | Optional `germplasm` | `reference`, `variantset`, `variant` |
+| `reference` | Reference sequence | `reference_set` | `reference_bases` |
+| `reference_bases` | Reference sequence fragment/page | `reference` | None |
+| `variantset` | Variant set | `reference_set`, `study` | `variant`, `callset_variant_sets`, `variantset_analysis`, `variantset_format` |
+| `variant` | Variant site | `reference_set`, `variantset` | `allele_call`, `marker_position` |
+| `callset` | Call set of a sample | `sample` | `allele_call`, `callset_variant_sets` |
+| `allele_call` | Genotype call result | `callset`, `variant` | None |
+| `genome_map` | Genetic map | `crop`; can be linked to `study` | `linkageGroup` |
+| `linkageGroup` | Linkage group | `genome_map` | `marker_position` |
+| `marker_position` | Position of a marker/variant on the map | `linkageGroup`, `variant` | None |
+
+## Recommended Entry Order
+
+### 1. Prepare the upstream Core/Phenotyping data
+
+Genotyping data usually sits on top of Core and Phenotyping.
+
+Required or common upstream tables include:
+
+```text
+crop
+program
+trial
+study
+observation_unit
+```
+
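+`sample` can be linked to an `observation_unit`, and it also carries redundant links to `program/trial/study` for querying and filtering.
+
+### 2. Enter Plate
+
+Enter `plate` first; it represents a sample plate.
+
+`plate` can be linked to: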
+
+```text
+program
+trial
+study
+plate_submission
+```
+
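+If samples do not go through a plate, `sample` can be entered directly; the current model also supports attaching a sample to a plate.
+
+### 3. Enter Sample
+
+Enter `sample`, the sample entry point of the genotyping workflow.
+
+Main relationships: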
+
+```text
+sample -> plate
+sample -> observation_unit
+sample -> program / trial / study
+sample -> germplasm_taxon
+```
+
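+### 4. Enter ReferenceSet and Reference
+
+Enter `reference_set`, which represents a reference genome set.
+
+Then enter `reference`, which represents a specific reference sequence, such as a chromosome or contig.
+
+If specific sequence fragments need to be stored, also enter: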
+
+```text
+reference_bases
+```
+
+### 5. Enter VariantSet
+
+Enter `variantset`, which groups a batch of variants into a set.
+
+Main relationships:
+
+```text
+variantset -> reference_set
+variantset -> study
+```
+
+Auxiliary tables include:
+
+```text
+variantset_analysis
+variantset_format
+variantset_additional_info
+variantset_external_references
+```
+
+### 6. Enter Variant
+
+Enter `variant`, which represents a specific variant site.
+
+Main relationships:
+
+```text
+variant -> reference_set
+variant -> variantset
+```
+
+Auxiliary tables include:
+
+```text
+variant_entity_alternate_bases
+variant_entity_ciend
+variant_entity_cipos
+variant_entity_filters_failed
+```
+
+### 7. Enter CallSet
+
+Enter `callset`, which represents a group of genotype calls for one sample.
+
+Main relationships:
+
+```text
+callset -> sample
+callset_variant_sets -> variantset
+```
+
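+`callset_variant_sets` is the many-to-many join table between `callset` and `variantset`.
+
+### 8. Enter Call
+
+Enter `allele_call`, which is the Call in business terms.
+
+It is the final genotype call result, and its core relationships are: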
+
+```text
+allele_call -> callset
+allele_call -> variant
+```
+
+In other words, one call records "the genotype, read depth, likelihood, and similar results of a given sample/callset at a given variant".
+
+### 9. Enter GenomeMap and MarkerPosition
+
+If genetic-map positioning is needed, enter:
+
+```text
+genome_map -> linkageGroup -> marker_position -> variant
+```
+
+`marker_position` in effect places a variant at a specific position on a linkage group.
+
+## Genotyping Data Flow Diagram
+
+```mermaid
+flowchart TD
+    C["Core: crop"] --> GM["genome_map (genetic map)"]
+    C --> P["Core: program"]
+    P --> T["Core: trial"]
+    T --> ST["Core: study"]
+    ST --> PL["plate (sample plate)"]
+    ST --> VS["variantset (variant set)"]
+    ST --> SM["sample"]
+
+    OU["Pheno: observation_unit"] --> SM
+    PL --> SM
+
+    GER["Germplasm (optional)"] --> RS["reference_set (reference set)"]
+    RS --> R["reference (reference sequence)"]
+    R --> RB["reference_bases (reference sequence pages)"]
+
+    RS --> VS
+    VS --> V["variant (variant site)"]
+    RS --> V
+
+    SM --> CS["callset (sample call set)"]
+    CS --> CSV["callset_variant_sets"]
+    VS --> CSV
+
+    CS --> CALL["allele_call / Call (genotype result)"]
+    V --> CALL
+
+    GM --> LG["linkageGroup (linkage group)"]
+    LG --> MP["marker_position (map position)"]
+    V --> MP
+
+    VS --> VSA["variantset_analysis"]
+    VS --> VSF["variantset_format"]
+```
+
+## Genotyping ER Diagram
+
+```mermaid
+erDiagram
+    program ||--o{ plate : "program_id"
+    trial ||--o{ plate : "trial_id"
+    study ||--o{ plate : "study_id"
+
+    plate ||--o{ sample : "plate_id"
+    observation_unit ||--o{ sample : "observation_unit_id"
+    program ||--o{ sample : "program_id"
+    trial ||--o{ sample : "trial_id"
+    study ||--o{ sample : "study_id"
+
+    germplasm ||--o{ reference_set : "source_germplasm_id"
+    reference_set ||--o{ reference : "reference_set_id"
+    reference ||--o{ reference_bases : "reference_id"
+
+    reference_set ||--o{ variantset : "reference_set_id"
+    study ||--o{ variantset : "study_id"
+    variantset ||--o{ variant : "variant_set_id"
+    reference_set ||--o{ variant : "reference_set_id"
+
+    sample ||--o{ callset : "sample_id"
+    callset ||--o{ callset_variant_sets : "call_sets_id"
+    variantset ||--o{ callset_variant_sets : "variant_sets_id"
+
+    callset ||--o{ allele_call : "call_set_id"
+    variant ||--o{ allele_call : "variant_id"
+
+    crop ||--o{ genome_map : "crop_id"
+    genome_map ||--o{ linkageGroup : "genome_map_id"
+    linkageGroup ||--o{ marker_position : "linkage_group_id"
+    variant ||--o{ marker_position : "variant_id"
+
+    variantset ||--o{ variantset_analysis : "variant_set_id"
+    variantset ||--o{ variantset_format : "variant_set_id"
+```
+
+## Mapping Between APIs and Tables
+
+| API | Main table | Description |
+| --- | --- | --- |
+| `/brapi/v2/samples` | `sample` | Query, create, and update samples |
+| `/brapi/v2/plates` | `plate` | Query, create, and update sample plates |
+| `/brapi/v2/callsets` | `callset` | Sample call sets |
+| `/brapi/v2/calls` | `allele_call` | Genotype call results |
+| `/brapi/v2/variants` | `variant` | Variant sites |
+| `/brapi/v2/variantsets` | `variantset` | Variant sets |
+| `/brapi/v2/referencesets` | `reference_set` | Reference genome sets |
+| `/brapi/v2/references` | `reference` | Reference sequences |
+| `/brapi/v2/maps` | `genome_map` | Genetic maps |
+| `/brapi/v2/markerpositions` | `marker_position` | Position of a variant/marker on the map |
+
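+## Key Points
+
+1. The database table for `CallEntity` is `allele_call`, not `call`.
+2. `CallSetEntity` maps to `callset`, not `call_set`.
+3. `VariantSetEntity` maps to `variantset`, not `variant_set`.
+4. `LinkageGroupEntity` maps to the table `linkageGroup`; take particular care with letter case wherever the schema references it through foreign keys.
+5. `sample` is the sample entry point of the genotyping workflow and links upward to `plate/observation_unit/study/trial/program`.
+6. `variant` is the locus definition and `allele_call` is a sample's result at that locus; do not treat the two as the same layer of data.
+7. `reference_set/reference/reference_bases` form the reference-genome side; `variantset/variant/callset/allele_call` form the variant-and-result side.
+8. `genome_map/linkageGroup/marker_position` form the genetic-map positioning side; `marker_position` connects to variant sites through `variant_id`.
+9. As in the previous two documents, `*_additional_info` and `*_external_references` are generic extension relations used to add business fields and external references.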
+
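The `sample -> callset -> allele_call <- variant` chain described above can be exercised with an in-memory SQLite database. This is a minimal sketch, not the project's schema: the table names and the foreign-key columns `sample_id`, `call_set_id`, and `variant_id` come from the ER diagram, while the `id` column types and the `genotype` column are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (id TEXT PRIMARY KEY);
CREATE TABLE callset (id TEXT PRIMARY KEY, sample_id TEXT REFERENCES sample(id));
CREATE TABLE variant (id TEXT PRIMARY KEY);
CREATE TABLE allele_call (
    id INTEGER PRIMARY KEY,
    call_set_id TEXT REFERENCES callset(id),
    variant_id TEXT REFERENCES variant(id),
    genotype TEXT  -- assumed column name, for illustration only
);
-- Entry order follows the dependencies: sample, callset, variant, allele_call.
INSERT INTO sample VALUES ('sample-1');
INSERT INTO callset VALUES ('callset-1', 'sample-1');
INSERT INTO variant VALUES ('variant-1'), ('variant-2');
INSERT INTO allele_call (call_set_id, variant_id, genotype) VALUES
    ('callset-1', 'variant-1', '0/1'),
    ('callset-1', 'variant-2', '1/1');
""")

# Resolve each call back to its sample (via callset) and to its variant.
rows = conn.execute("""
SELECT s.id, v.id, c.genotype
FROM allele_call AS c
JOIN callset AS cs ON cs.id = c.call_set_id
JOIN sample AS s ON s.id = cs.sample_id
JOIN variant AS v ON v.id = c.variant_id
ORDER BY v.id
""").fetchall()
print(rows)
```

Each returned row reads as "this sample has this genotype at this variant", which is the role the document assigns to `allele_call`.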
--- a/docs/architecture/04-germplasm-seed-data-flow.md
+++ b/docs/architecture/04-germplasm-seed-data-flow.md
@@ -0,0 +1,142 @@
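+# Germplasm and Seed Data Flow and Table Relationships
+
+This document covers the tables in the Germplasm module that center on `germplasm`, together with the related SeedLot, CrossingProject, GermplasmAttribute, Cross, and Pedigree tables and the relationships among them. The key tables are: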
+
+```text
+germplasm
+seed_lot
+seed_lot_content_mixture
+crossing_project
+cross_entity / planned cross / cross_parent
+germplasm_attribute_definition
+germplasm_attribute_value
+pedigree_node / pedigree_edge
+breeding_method
+```
+
+## Conclusion
+
+The main lines of the Germplasm relationships can be read as:
+
+```text
+breeding_method -> germplasm -> germplasm_attribute_value -> germplasm_attribute_definition
+program -> crossing_project -> cross_entity -> cross_parent -> germplasm
+germplasm / cross_entity -> seed_lot_content_mixture -> seed_lot
+germplasm -> pedigree_node -> pedigree_edge
+```
+
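+`plannedcross` is not a separate table in the database but a self-reference on `cross_entity`: `cross_entity.planned_cross_id -> cross_entity.id`, with the `planned` field distinguishing planned crosses from actual crosses.
+
+## Figure 4: Germplasm Relationship Architecture Diagram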
+
+```mermaid
+flowchart TD
+    BM["breeding_method<br/>Breeding method"] -->|"breeding_method_id"| G["germplasm<br/>Germplasm master table"]
+    CROP["crop<br/>Crop"] -->|"crop_id"| G
+
+    GAD["germplasm_attribute_definition<br/>GermplasmAttribute definition"] -->|"attribute_id"| GAV["germplasm_attribute_value<br/>GermplasmAttributeValue"]
+    G -->|"germplasm_id"| GAV
+    CROP -->|"crop_id"| GAD
+    TR["trait"] -->|"trait_id"| GAD
+    ME["method"] -->|"method_id"| GAD
+    SC["scale"] -->|"scale_id"| GAD
+    ONT["ontology"] -->|"ontology_id"| GAD
+
+    PR["program<br/>Program"] -->|"program_id"| CP["crossing_project<br/>CrossingProject"]
+    CP -->|"crossing_project_id"| CR["cross_entity<br/>Cross / PlannedCross"]
+    CP -->|"crossing_project_id"| XP["cross_parent<br/>CrossParent"]
+
+    CR -->|"cross_id"| XP
+    G -->|"germplasm_id"| XP
+    OU["observation_unit<br/>Optional parent source"] -->|"observation_unit_id"| XP
+    CR -->|"planned_cross_id self-reference"| PCR["cross_entity<br/>planned cross"]
+
+    G -->|"germplasm_id"| SCM["seed_lot_content_mixture<br/>SeedLot composition"]
+    CR -->|"cross_id"| SCM
+    SCM -->|"seed_lot_id"| SL["seed_lot<br/>SeedLot"]
+    PR -->|"program_id"| SL
+    LOC["location<br/>Storage location / site"] -->|"location_id"| SL
+
+    SL -->|"from_seed_lot_id"| TX["seed_lot_transaction<br/>SeedLot transfer"]
+    TX -->|"to_seed_lot_id"| SL
+
+    CP -->|"crossing_project_id"| PN["pedigree_node<br/>PedigreeNode"]
+    G -->|"germplasm_id"| PN
+    PN -->|"this_node_id"| PE["pedigree_edge<br/>Parent-child / sibling relation"]
+    PE -->|"connceted_node_id"| PN2["pedigree_node<br/>Parent / child node"]
+```
+
+## Figure 4-2: Germplasm ER Diagram
+
+```mermaid
+erDiagram
+    crop ||--o{ germplasm : "crop_id"
+    breeding_method ||--o{ germplasm : "breeding_method_id"
+
+    germplasm ||--o{ germplasm_attribute_value : "germplasm_id"
+    germplasm ||--o{ germplasm_donor : "germplasm_id"
+    germplasm ||--o{ germplasm_institute : "germplasm_id"
+    germplasm ||--o{ germplasm_origin : "germplasm_id"
+    germplasm ||--o{ germplasm_synonym : "germplasm_id"
+    germplasm ||--o{ germplasm_taxon : "germplasm_id"
+
+    germplasm ||--o| pedigree_node : "germplasm_id"
+    pedigree_node ||--o{ pedigree_edge : "this_node_id"
+    pedigree_node ||--o{ pedigree_edge : "connceted_node_id"
+
+    crossing_project ||--o{ cross_entity : "crossing_project_id"
+    cross_entity ||--o{ cross_entity : "planned_cross_id"
+    cross_entity ||--o{ cross_parent : "cross_id"
+    crossing_project ||--o{ cross_parent : "crossing_project_id"
+    germplasm ||--o{ cross_parent : "germplasm_id"
+
+    seed_lot ||--o{ seed_lot_content_mixture : "seed_lot_id"
+    germplasm ||--o{ seed_lot_content_mixture : "germplasm_id"
+    cross_entity ||--o{ seed_lot_content_mixture : "cross_id"
+
+    location ||--o{ seed_lot : "location_id"
+    program ||--o{ seed_lot : "program_id"
+
+    seed_lot ||--o{ seed_lot_transaction : "from_seed_lot_id"
+    seed_lot ||--o{ seed_lot_transaction : "to_seed_lot_id"
+```
+
+## Core Tables
+
+| Table | Purpose | Main upstream dependencies | Main downstream |
+| --- | --- | --- | --- |
+| `germplasm` | Germplasm master table; stores accession, PUI, species, collection source, seed source, and similar information | `crop`, `breeding_method` | Attributes, institutes, origins, pedigree, SeedLot composition, Cross parents |
+| `breeding_method` | Breeding method dictionary | None | `germplasm` |
+| `germplasm_attribute_definition` | GermplasmAttribute definition; inherits the variable definition structure and can link to trait/method/scale/ontology/crop | `crop`, `trait`, `method`, `scale`, `ontology` | `germplasm_attribute_value` |
+| `germplasm_attribute_value` | GermplasmAttributeValue; stores the value of one germplasm for one attribute | `germplasm`, `germplasm_attribute_definition` | Attribute queries |
+| `crossing_project` | CrossingProject | `program` | `cross_entity`, `cross_parent`, `pedigree_node` |
+| `cross_entity` | Single storage table for Cross/PlannedCross; `planned_cross_id` is a self-reference to this table | `crossing_project`, `cross_entity` | `cross_parent`, `seed_lot_content_mixture` |
+| `cross_parent` | CrossParent; connects `cross_entity` to a `germplasm` or an observation unit | `cross_entity`, `crossing_project`, `germplasm`, `observation_unit` | Cross parents |
+| `seed_lot` | Seed lot/inventory lot; stores amount, unit, storage location, program, and creation and update times | `location`, `program` | `seed_lot_content_mixture`, `seed_lot_transaction` |
+| `seed_lot_content_mixture` | SeedLot composition details; connects `seed_lot` to `germplasm` or `cross_entity` | `seed_lot`, `germplasm`, `cross_entity` | Share of each component within a lot |
+| `seed_lot_transaction` | SeedLot transfer record; records the change in amount from one lot to another | `from_seed_lot`, `to_seed_lot` | Inventory flow tracking |
+| `pedigree_node` | Pedigree node; one node can be linked to one germplasm | `germplasm`, `crossing_project` | `pedigree_edge` |
+| `pedigree_edge` | Pedigree edge; describes parent/child/sibling relationships | `pedigree_node` | Pedigree queries |
+
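+## Recommended Entry Order
+
+1. Enter the upstream base data first: `crop`, `breeding_method`, `program`, `location`, plus the `trait/method/scale/ontology` records needed by attribute definitions.
+2. Enter `germplasm_attribute_definition` to define the GermplasmAttributes that can be collected.
+3. Enter the `germplasm` master data and link each record to its breeding method through `breeding_method_id`.
+4. Enter `germplasm_attribute_value` to connect a germplasm with an attribute definition and store the actual value.
+5. If crossing is involved, enter `crossing_project`, then enter planned and actual crosses into `cross_entity`; planned crosses are expressed through `planned=true` or the `planned_cross_id` self-reference.
+6. Enter `cross_parent` to link a cross with its parent germplasm or observation unit.
+7. Enter `pedigree_node` and `pedigree_edge` to express the parent/child/sibling pedigree relationships of germplasm.
+8. Enter `seed_lot` to store the lot amount, unit, storage location, and owning program.
+9. Enter `seed_lot_content_mixture` to connect a seed lot with one or more `germplasm`/`cross_entity` records.
+10. For later stock movements, splits, merges, or transfers, enter `seed_lot_transaction` and track the flow through `from_seed_lot_id` and `to_seed_lot_id`.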
+
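+## Key Points
+
+1. `germplasm.seedSource` and `germplasm.seedSourceDescription` are descriptive fields on the germplasm master table and are not the same thing as an inventory lot.
+2. The table that actually represents inventory lots is `seed_lot`, and the relationship between lots and germplasm lives in `seed_lot_content_mixture`.
+3. `seed_lot_content_mixture` can link to `germplasm` or to `cross_entity`, which suits mixed seed lots and lots produced by crosses.
+4. `seed_lot_transaction` has both `fromSeedLot` and `toSeedLot`, so it expresses a transfer from one seed lot to another, not a direct relationship between a seed lot and a germplasm.
+5. `plannedcross` has no separate database table; it uses `cross_entity` and is expressed through the `planned` field and the `planned_cross_id` self-reference.
+6. `germplasm_attribute_definition` is the attribute definition and `germplasm_attribute_value` is the actual attribute value on a germplasm; the two are connected through `attribute_id`.
+7. Pedigree relationships are expressed by `pedigree_node` and `pedigree_edge`; the crossing workflow is expressed by `cross_entity` and `cross_parent`; both can be traced back to the `germplasm` master data.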
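The `cross_entity` self-reference can be pictured with a short Python sketch. It is illustrative only: the `CrossEntity` dataclass and the helper `actual_crosses_for` are hypothetical names, and the sketch assumes that an actual cross points to the planned cross it realizes through `planned_cross_id`, which is one reading of the self-reference described in this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrossEntity:
    """Hypothetical stand-in for a cross_entity row."""
    id: str
    planned: bool                            # True for a planned cross
    planned_cross_id: Optional[str] = None   # self-reference to cross_entity.id


def actual_crosses_for(planned_id: str, rows: List[CrossEntity]) -> List[str]:
    """Return ids of actual crosses that reference the given planned cross."""
    return [
        row.id
        for row in rows
        if not row.planned and row.planned_cross_id == planned_id
    ]


rows = [
    CrossEntity("pc-1", planned=True),
    CrossEntity("c-1", planned=False, planned_cross_id="pc-1"),
    CrossEntity("c-2", planned=False, planned_cross_id="pc-1"),
    CrossEntity("c-3", planned=False),  # an actual cross with no plan
]

planned_ids = [row.id for row in rows if row.planned]
realized = actual_crosses_for("pc-1", rows)
print(planned_ids, realized)
```

Keeping both kinds of cross in one table means a single query path serves `cross_parent` and `seed_lot_content_mixture`, at the cost of having to filter on `planned` whenever only one kind is wanted.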
--- a/docs/architecture/core-data-flow.md
+++ b/docs/architecture/core-data-flow.md
@@ -0,0 +1,210 @@
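+# Core Module Data Flow and Table Relationships
+
+This document analyzes the data entry order and main table relationships of the core module of the `brapi-java` project, as well as the actual data flow in the initialization scripts.
+
+## Conclusion
+
+The main data line of the Core module is: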
+
+```text
+crop -> person -> program -> location -> trial -> season -> study -> study auxiliary tables
+```
+
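+`list` is a relatively independent list feature and can be entered early; it depends on `person` only when a list needs an owner bound to it.
+
+The initialization scripts actually run in this order: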
+
+```text
+R__init_data_01_crops.sql
+R__init_data_02_lists.sql
+R__init_data_03_locations.sql
+R__init_data_04_people.sql
+R__init_data_05_programs.sql
+R__init_data_06_trials.sql
+R__init_data_07_seasons.sql
+R__init_data_08_studies.sql
+```
+
+Note that `R__init_data_05_programs.sql` also inserts the program lead persons into the `person` table and backfills `location.program_id`/`crop_id` for some locations.
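
The script order listed above follows the zero-padded step number embedded in each file name. A small sketch, assuming the migration tool orders these scripts by name (the source does not state which tool applies them), shows how that order can be recovered from an unordered listing; the `step_number` helper is illustrative only.

```python
import re

# File names copied from the list above, deliberately shuffled.
scripts = [
    "R__init_data_05_programs.sql",
    "R__init_data_01_crops.sql",
    "R__init_data_08_studies.sql",
    "R__init_data_03_locations.sql",
    "R__init_data_02_lists.sql",
    "R__init_data_07_seasons.sql",
    "R__init_data_04_people.sql",
    "R__init_data_06_trials.sql",
]


def step_number(filename):
    """Extract the numeric step from an R__init_data_NN_name.sql file name."""
    match = re.fullmatch(r"R__init_data_(\d+)_(\w+)\.sql", filename)
    if match is None:
        raise ValueError(f"unexpected script name: {filename}")
    return int(match.group(1))


ordered = sorted(scripts, key=step_number)
```

In this order `locations` (03) precedes `programs` (05), which is why the program script has to backfill `location.program_id` and `crop_id` afterwards.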
+
+## Core Tables
+
+| Table | Purpose | Main upstream dependencies | Main downstream tables |
+| --- | --- | --- | --- |
+| `crop` | Crop dictionary; one of the root data sets of Core | None | `program`, `location`, `trial`, `study` |
+| `person` | People: contacts and leads | None | `program.lead_person_id`, `trial_contact`, `study_contact`, `list.list_owner_person_id` |
+| `program` | Breeding program / business program | `crop`, optionally `person` | `trial`, `study`, `location` |
+| `location` | Location / trial site | Optionally `crop`, `program`, parent `location`, `geojson` | `study` |
+| `trial` | Trial batch / trial project | `crop`, `program` | `study`, `trial_contact`, `trial_publication`, `trial_dataset_authorship` |
+| `season` | Season / year-range dictionary | None | `study_season`, some observations |
+| `study` | Concrete study / unit in which a trial is carried out | `crop`, `program`, `trial`, `location` | `study_contact`, `study_season`, `study_data_link`, `study_observation_level`, and others |
+| `list` | Generic list | Optionally `person` | `list_item` |
+| `list_item` | List item | `list` | None |
+
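+## Recommended Data Entry Order
+
+### 1. Enter the base dictionaries
+
+Enter `crop` and `person` first.
+
+`crop` is the root data for the crop dimension; `program`, `trial`, and `study` all hang off it later. `person` holds the basic people records, which are later used as program leads, trial contacts, study contacts, and list owners.
+
+### 2. Enter locations
+
+Enter `location`. If a location has coordinates, `geojson` and `coordinate` must be entered first.
+
+A location does not have to be bound to `program/crop` up front. The initialization scripts do exactly this: they insert the locations first and backfill `program_id` and `crop_id` during the program initialization step.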
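
The insert-then-backfill pattern for `location` can be sketched with an in-memory SQLite database. This is only an illustration under assumed table shapes: apart from `program_id` and `crop_id`, the column names (`id`, `name`, `location_name`, `crop_name`) are hypothetical and do not come from the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE crop (id TEXT PRIMARY KEY, crop_name TEXT);
    CREATE TABLE program (
        id TEXT PRIMARY KEY,
        name TEXT,
        crop_id TEXT REFERENCES crop(id)
    );
    CREATE TABLE location (
        id TEXT PRIMARY KEY,
        location_name TEXT,
        program_id TEXT REFERENCES program(id),  -- nullable: bound later
        crop_id TEXT REFERENCES crop(id)         -- nullable: bound later
    );
    """
)

# Step 1: insert the location with no program/crop binding yet.
conn.execute("INSERT INTO location (id, location_name) VALUES ('loc1', 'Field A')")
unbound = conn.execute(
    "SELECT program_id, crop_id FROM location WHERE id = 'loc1'"
).fetchone()

# Step 2: crop and program arrive later.
conn.execute("INSERT INTO crop VALUES ('crop1', 'Maize')")
conn.execute("INSERT INTO program VALUES ('prog1', 'Demo Program', 'crop1')")

# Step 3: backfill the location from the program it now belongs to.
conn.execute(
    """
    UPDATE location
    SET program_id = 'prog1',
        crop_id = (SELECT crop_id FROM program WHERE id = 'prog1')
    WHERE id = 'loc1'
    """
)
bound = conn.execute(
    "SELECT program_id, crop_id FROM location WHERE id = 'loc1'"
).fetchone()
```

Keeping both foreign keys nullable is what allows locations to be loaded before programs exist.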
+
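+### 3. Enter programs
+
+A `program` requires an existing `crop`. If it has a lead, the corresponding `person` must already exist as well.
+
+At the code level, `ProgramEntity.setCrop(...)` binds the crop directly; when a trial or study is later assigned a program, it inherits that program's crop at the same time.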
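
The crop inheritance described above can be modeled roughly as follows. The real entities are Java classes in `brapi-java`; this Python sketch only mirrors the described behavior, and the class and method names here (`Program.set_crop`, `Trial.set_program`) are illustrative rather than the project's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Crop:
    name: str


@dataclass
class Program:
    name: str
    crop: Optional[Crop] = None

    def set_crop(self, crop: Crop) -> None:
        # The program binds its crop directly.
        self.crop = crop


@dataclass
class Trial:
    name: str
    program: Optional[Program] = None
    crop: Optional[Crop] = None

    def set_program(self, program: Program) -> None:
        # Assigning a program also copies the program's crop onto the trial.
        self.program = program
        if program.crop is not None:
            self.crop = program.crop


maize = Crop("Maize")
program = Program("Demo Program")
program.set_crop(maize)

trial = Trial("Yield Trial")
trial.set_program(program)
```

With this design a caller only has to pick the program; the trial's `crop_id` stays consistent with it without being set separately.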
+
+### 4. Enter trials
+
+A `trial` requires an existing `program` and `crop`.
+
+The following tables can be entered together with a `trial`:
+
+```text
+trial_contact
+trial_dataset_authorship
+trial_publication
+trial_additional_info
+trial_external_references
+```
+
+Here `trial_contact` is the many-to-many join table between `trial` and `person`.
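
A minimal sketch of that many-to-many relationship, using an in-memory SQLite database. The join columns `trial_db_id` and `person_db_id` follow the ER diagram in this document; the remaining columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id TEXT PRIMARY KEY, first_name TEXT);
    CREATE TABLE trial (id TEXT PRIMARY KEY, trial_name TEXT);
    CREATE TABLE trial_contact (
        trial_db_id TEXT NOT NULL REFERENCES trial(id),
        person_db_id TEXT NOT NULL REFERENCES person(id),
        PRIMARY KEY (trial_db_id, person_db_id)
    );
    INSERT INTO person VALUES ('p1', 'Ana'), ('p2', 'Bo');
    INSERT INTO trial VALUES ('t1', 'Yield 2025'), ('t2', 'Drought 2025');
    -- One trial can have several contacts, and one person can serve several trials.
    INSERT INTO trial_contact VALUES ('t1', 'p1'), ('t1', 'p2'), ('t2', 'p1');
    """
)

contacts_of_t1 = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.first_name
        FROM trial_contact tc
        JOIN person p ON p.id = tc.person_db_id
        WHERE tc.trial_db_id = 't1'
        ORDER BY p.first_name
        """
    )
]

trials_of_p1 = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.trial_name
        FROM trial_contact tc
        JOIN trial t ON t.id = tc.trial_db_id
        WHERE tc.person_db_id = 'p1'
        ORDER BY t.trial_name
        """
    )
]
```

`study_contact` and `study_season` follow the same join-table shape between `study` and `person` or `season`.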
+
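+### 5. Enter seasons
+
+Enter `season`. It is relatively independent on its own, but a `study` later links to multiple seasons through `study_season`.
+
+### 6. Enter studies
+
+A `study` usually requires the following to exist already: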
+
+```text
+crop
+program
+trial
+location
+```
+
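+`study` is the key node through which the core module extends into pheno/geno data. Many later modules, including observation, observation_unit, sample, plate, and variantset, reference `study`.
+
+### 7. Enter study auxiliary tables
+
+Once a study exists, enter the auxiliary tables that depend on `study_id`: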
+
+```text
+study_contact
+study_data_link
+study_environment_parameter
+study_experimental_design
+study_growth_facility
+study_last_update
+study_observation_level
+study_season
+study_variable
+study_additional_info
+study_external_references
+```
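
The entry order in steps 1 to 7 is essentially a topological order over the table dependencies. The sketch below encodes only the required dependencies from the core table overview (optional links such as `program` to `person` or `location` to `program` are deliberately left out) and derives a valid order with the standard-library `graphlib` module (Python 3.9+). The `REQUIRED_DEPS` mapping is an illustrative reading of this document, not something generated from the schema.

```python
from graphlib import TopologicalSorter

# table -> set of tables that must already contain data (required links only)
REQUIRED_DEPS = {
    "crop": set(),
    "person": set(),
    "season": set(),
    "location": set(),  # crop/program bindings are optional and can be backfilled
    "program": {"crop"},
    "trial": {"crop", "program"},
    "study": {"crop", "program", "trial", "location"},
    "study_season": {"study", "season"},
    "study_contact": {"study", "person"},
    "trial_contact": {"trial", "person"},
    "list": set(),  # owner person is optional
    "list_item": {"list"},
}

entry_order = list(TopologicalSorter(REQUIRED_DEPS).static_order())
position = {table: index for index, table in enumerate(entry_order)}
```

Any order produced this way places `crop` before `program`, `program` before `trial`, and `trial` and `location` before `study`, which matches the manual sequence above; tables with no mutual dependency may appear in a different relative order.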
+
+## Core Data Flow Diagram
+
+```mermaid
+flowchart TD
+    A["1. crop (crop dictionary)"] --> C["3. program"]
+    B["1. person (people)"] --> C
+    B --> L["list (owner person, optional)"]
+    L --> LI["list_item (list items)"]
+
+    G["geojson / coordinate (coordinates)"] --> D["2. location"]
+    A --> D
+    C --> D
+    D --> E["6. study"]
+
+    C --> F["4. trial"]
+    A --> F
+    B --> FC["trial_contact (trial contacts)"]
+    F --> FC
+    F --> FP["trial_publication / trial_dataset_authorship"]
+
+    S["5. season"] --> SS["study_season (study seasons)"]
+    F --> E
+    C --> E
+    A --> E
+    E --> SS
+    E --> SC["study_contact (study contacts)"]
+    B --> SC
+    E --> SA["study_data_link / environment / design / growth_facility / last_update / observation_level"]
+
+    E --> P1["pheno: observation_unit / observation"]
+    E --> G1["geno: sample / plate / variantset"]
+```
+
+## Core ER Diagram
+
+```mermaid
+erDiagram
+    crop ||--o{ program : "crop_id"
+    crop ||--o{ location : "crop_id"
+    crop ||--o{ trial : "crop_id"
+    crop ||--o{ study : "crop_id"
+
+    person ||--o{ program : "lead_person_id"
+    person ||--o{ trial_contact : "person_db_id"
+    person ||--o{ study_contact : "person_db_id"
+    person ||--o{ list : "list_owner_person_id"
+
+    program ||--o{ location : "program_id"
+    program ||--o{ trial : "program_id"
+    program ||--o{ study : "program_id"
+
+    location ||--o{ location : "parent_location_id"
+    location ||--o{ study : "location_id"
+
+    trial ||--o{ study : "trial_id"
+    trial ||--o{ trial_contact : "trial_db_id"
+    trial ||--o{ trial_publication : "trial_id"
+    trial ||--o{ trial_dataset_authorship : "trial_id"
+
+    study ||--o{ study_contact : "study_db_id"
+    study ||--o{ study_season : "study_db_id"
+    season ||--o{ study_season : "season_db_id"
+
+    study ||--o{ study_data_link : "study_id"
+    study ||--o{ study_environment_parameter : "study_id"
+    study ||--o{ study_experimental_design : "study_id"
+    study ||--o{ study_growth_facility : "study_id"
+    study ||--o{ study_last_update : "study_id"
+    study ||--o{ study_observation_level : "study_id"
+
+    list ||--o{ list_item : "list_id"
+```
+
+## API-to-Table Mapping
+
+| API | Main table | Notes |
+| --- | --- | --- |
+| `GET /brapi/v2/commoncropnames` | `crop` | Lists crop names |
+| `GET/POST/PUT /brapi/v2/people` | `person` | Query, create, and update people; there is no delete endpoint |
+| `GET/POST/PUT /brapi/v2/programs` | `program` | A program depends on crop and can reference a lead person |
+| `GET/POST/PUT /brapi/v2/locations` | `location` | A location can reference crop, program, parent location, and geojson |
+| `GET/POST/PUT /brapi/v2/trials` | `trial` | A trial depends on program/crop and can reference contacts/publications |
+| `GET/PUT /brapi/v2/seasons` | `season` | Season dictionary |
+| `GET/POST/PUT /brapi/v2/studies` | `study` | A study depends on trial/program/crop/location |
+| `GET/POST/PUT /brapi/v2/lists` | `list`, `list_item` | Lists and list items |
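
As a rough illustration of how a client might address these endpoints, the sketch below only builds request URLs and sends nothing. The base URL is a hypothetical local deployment; `page`, `pageSize`, and the `trialDbId` filter are BrAPI v2 query parameters, and whether this test server honors every filter is not confirmed by this document.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical deployment address; adjust to the actual server.
BASE_URL = "http://localhost:8080/brapi/v2/"


def build_list_url(resource, page=0, page_size=20, **filters):
    """Build a paginated GET URL for a BrAPI v2 list endpoint."""
    params = {"page": page, "pageSize": page_size, **filters}
    return urljoin(BASE_URL, resource) + "?" + urlencode(params)


studies_url = build_list_url("studies", trialDbId="trial1")
crops_url = build_list_url("commoncropnames", page_size=50)
```

Filtering studies by `trialDbId` mirrors the `trial -> study` relationship in the table above.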
+
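+## Key Notes
+
+1. `crop` is one of the most important root dictionaries; many business tables carry a `crop_id`.
+2. `program` is the connecting business node: it depends on `crop` and is referenced by `trial`, `study`, and `location`.
+3. `trial` is the parent trial organization of a study, and `study` is the main entry point for downstream phenotype and genotype data.
+4. `person` has a many-to-many relationship with `trial` and `study`, joined through `trial_contact` and `study_contact`.
+5. `study_season` is the many-to-many relationship between `study` and `season`.
+6. `additional_info` and `external_reference` are generic extension tables; each core table links to them through its own `*_additional_info` and `*_external_references` tables.
+7. The initialization scripts insert `list` before `person` because the initial list data mostly uses a text owner field; if the business needs to set `list_owner_person_id`, the `person` must exist first.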
+