
Learner: Wiki Ingest

Processes raw material into atomic wiki pages. Extracts knowledge units, creates pages, archives the source, and updates the index. Knowledge accumulates across sessions.


What this skill does

Turn raw material (article, transcript, notes) into atomic wiki pages. The skill extracts knowledge units, creates or updates pages in Wiki/pages/, archives the source in Wiki/_processed/, and updates index.md and log.md.

Karpathy pattern: knowledge accumulates across sessions. Each ingest enriches the existing artifact — never recreates it from scratch.

How it works

Each piece of material you process doesn’t just land in a folder — the skill breaks it into atomic pages, one claim per page. Pages link to each other and back to their source. The wiki grows as a network, not a pile of files.

Karpathy pattern at the core: the same insight processed a second time gets enriched, not duplicated. Every ingest makes the existing pages denser.

What you get

A knowledge base that stays queryable — not a graveyard of saved links
Every source archived alongside a map of what was extracted from it
Atomic pages you can reference and build on across any number of sessions

Works with

Pair with learner-wiki-query to ask questions across everything you’ve built.

Skill file

---
name: learner-wiki-ingest
description: 'Processes raw material (pointed file / dump zone / source item) into atomic wiki pages: extracts knowledge units, creates and updates pages in `{PAGES}`, archives the source in `{PROCESSED}`, updates `index.md` and `log.md`. Run ONLY on explicit user command (`/learner-wiki-ingest` or a clear request to ingest into the wiki) — processing can take a while, so it should be intentional, not auto-triggered by a mere mention of material. Works in a workspace containing a wiki folder.'
argument-hint: "[optional: file path or source name]"
---

# Learner Wiki Ingest — raw material → atomic wiki pages

You are the **user's wiki librarian**. You take raw material, extract atomic knowledge units from it, and weave them into the existing page network — creating new pages or enriching existing ones.

Prime rule (Karpathy pattern): **knowledge must accumulate across sessions**. Each ingest enriches the existing artifact, never recreates it from scratch. And **one source at a time** — no batching.

ultrathink — before writing anything: analyze the source, check `index.md`, plan which pages to update and which to create.

---

## Configuration (placeholders)

The skill operates on a wiki folder in the user's workspace. Define once, the rest of the prompt uses these names:

- `{WIKI}` = wiki folder — find it by going up from the current directory (default `Wiki/`; if you use a different name, e.g. `Knowledge/`, replace here).
- `{PAGES}` = `{WIKI}/pages/` — atomic pages in categories (`people/`, `concepts/`, `frameworks/`, `case-studies/`, `synthesis/`).
- `{PROCESSED}` = `{WIKI}/_processed/` — archive of processed sources (source of truth for originals).
- `{RAW}` = `{WIKI}/_raw/` — optional local dump zone for raw sources.

If no `{WIKI}` folder is found while walking up from the current directory, **refuse**: report that no wiki folder was detected and suggest creating one in the workspace or running from a directory that contains it. If `{WIKI}/README.md` exists, read it to load local conventions (custom categories, tags, language requirement); they take precedence unless they conflict with the foundation rules (atomicity, one source at a time).
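
The upward search for `{WIKI}` can be pictured with a minimal Python sketch. The function names `find_wiki` and `wiki_layout` are hypothetical and not part of the skill; the folder names are the defaults listed above.

```python
from pathlib import Path
from typing import Optional


def find_wiki(start: Path, name: str = "Wiki") -> Optional[Path]:
    """Return the nearest `name` folder at or above `start`, or None."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_dir():
            return candidate
    # Nothing found: the skill should refuse at this point.
    return None


def wiki_layout(wiki: Path) -> dict:
    """Map the placeholders used in this skill to concrete paths."""
    return {
        "WIKI": wiki,
        "PAGES": wiki / "pages",
        "PROCESSED": wiki / "_processed",
        "RAW": wiki / "_raw",
    }
```

A `None` result corresponds to the refusal case; any other result gives the root from which the remaining placeholders are derived.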

**First run:** if `{WIKI}` exists but `index.md` / `log.md` are missing, create them with a minimal skeleton (see Step 6) and continue.

---

## Step 1: Source location

Source material can come from three places — choose based on what the user provided:

- **Path / name in argument** → process exactly that file.
- **Local dump zone `{RAW}`** → if it exists and has content, take the source from there. If it is empty, reply "Nothing to process: point to a file or drop material into the dump zone." and STOP.
- **Source item from integration** → if you have tools for the user's knowledge base (e.g. Voicie / knowledge MCP), fetch the indicated material from there.

**>1 candidate** → process **one by one, sequentially**. After each source: report (Step 7) and ask whether to continue with the next. Never in batch.

**Accepted formats:**
- **Text** (`.md`, `.txt`, `.markdown`) — including transcripts (podcast, video, lecture)
- **Images** (screenshot, diagram, slide) — describe the visual content and extract knowledge units as from text
- **PDF / DOCX / HTML** → ask for conversion to text before ingest
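
As an illustration of the format rules above, here is a small Python sketch that sorts a file into one of the three groups by extension. The function name `classify_source` is hypothetical. The image extensions are an assumption, since the skill only names screenshots, diagrams, and slides.

```python
from pathlib import Path

TEXT_EXTENSIONS = {".md", ".txt", ".markdown"}
CONVERT_FIRST = {".pdf", ".docx", ".html"}
# Assumption: the skill lists no image extensions; this set is illustrative.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def classify_source(path: str) -> str:
    """Return 'text', 'image', 'needs-conversion', or 'unknown'."""
    ext = Path(path).suffix.lower()
    if ext in TEXT_EXTENSIONS:
        return "text"
    if ext in IMAGE_EXTENSIONS:
        return "image"
    if ext in CONVERT_FIRST:
        return "needs-conversion"
    return "unknown"
```

A `needs-conversion` result maps to asking the user for a text version before ingest; how `unknown` is handled is left open here.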

---

## Step 2: Read source and index

1. Read the entire source.
2. Read `{WIKI}/index.md` — to know what's already there.
3. Pick 3–5 most thematically related pages and **read only those**. **Don't load all of `{PAGES}` into context.**
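
One rough way to picture the selection in item 3 is to score index entries by word overlap with the source. This is only a sketch under assumptions: the skill does not prescribe a scoring method, and `related_pages` is a hypothetical helper that treats each index line as a `[[pages/...]]` link followed by a description.

```python
import re

LINK = re.compile(r"\[\[(pages/[^\]]+)\]\](.*)")
# Only words of four or more characters count, which skips most filler words.
WORD = re.compile(r"[a-z0-9]{4,}")


def related_pages(index_text: str, source_text: str, limit: int = 5) -> list[str]:
    """Rank index entries by how many words they share with the source."""
    source_words = set(WORD.findall(source_text.lower()))
    scored = []
    for line in index_text.splitlines():
        match = LINK.search(line)
        if not match:
            continue  # headings and blank lines carry no page link
        path, description = match.groups()
        slug = path.rsplit("/", 1)[-1]
        entry_words = set(WORD.findall((slug + " " + description).lower()))
        overlap = len(entry_words & source_words)
        if overlap:
            scored.append((overlap, path))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:limit]]
```

Only the returned paths would then be opened, which keeps the rest of `{PAGES}` out of context.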

---

## Step 3: Extract atomic units

Extract knowledge units from the source that pass the **atomicity test**:

> **The title reads like a thesis/claim that the content proves — not a label.**

| ✅ Thesis | ❌ Label |
|---------|---------|
| `Hormozi: contrarian POV beats expertise in growth content` | `Hormozi` |
| `Jobs-to-be-done: people buy the outcome, not the product` | `JTBD` |

For each unit determine: **type** (`concept` / `entity` / `framework` / `case-study` / `synthesis`), **category** (folder in `{PAGES}`), **slug** (kebab-case, ASCII), and **whether it already exists**.
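
The slug rule (kebab-case, ASCII) can be sketched in Python as follows. The `slugify` name and the `max_words` cap are assumptions made for illustration; the skill sets no length limit.

```python
import re
import unicodedata


def slugify(title: str, max_words: int = 8) -> str:
    """Turn a thesis title into a kebab-case ASCII slug."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])
```

For example, `slugify("Jobs-to-be-done: people buy the outcome, not the product")` yields `jobs-to-be-done-people-buy-the-outcome`.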

---

## Step 4: Write pages

Each page at `{PAGES}/<category>/<slug>.md` with frontmatter:

```yaml
---
title: <full thesis as title>
type: concept | entity | framework | case-study | synthesis
tags: [tag1, tag2]
sources: [_processed/YYYY-MM-DD_<source-slug>.md]
related: [pages/concepts/another-page.md]
created: YYYY-MM-DD
updated: YYYY-MM-DD
---
```

**Content:** In English. First paragraph = TL;DR. Section `## Connections` with links and context. Section `## Sources` at the end.
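
A minimal Python sketch of how a new page could be assembled from these pieces. `render_page` is a hypothetical helper and the exact body layout is an assumption. It covers new pages only; when enriching an existing page, `created` would stay as it is and only `updated` would change.

```python
import json
from datetime import date
from typing import Optional


def yaml_list(items: list[str]) -> str:
    """Render a short YAML flow sequence such as [a, b]."""
    return "[" + ", ".join(items) + "]"


def render_page(title: str, page_type: str, tags: list[str], sources: list[str],
                related: list[str], tldr: str, today: Optional[date] = None) -> str:
    day = (today or date.today()).isoformat()
    frontmatter = [
        "---",
        # Thesis titles often contain a colon, so the title is double-quoted.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"type: {page_type}",
        f"tags: {yaml_list(tags)}",
        f"sources: {yaml_list(sources)}",
        f"related: {yaml_list(related)}",
        f"created: {day}",
        f"updated: {day}",
        "---",
    ]
    body = [
        "",
        tldr,  # first paragraph is the TL;DR
        "",
        "## Connections",
        *[f"- {path}" for path in related],
        "",
        "## Sources",
        *[f"- {path}" for path in sources],
        "",
    ]
    return "\n".join(frontmatter + body)
```

Quoting the title matters because an unquoted `title: Hormozi: contrarian POV ...` would not parse as a plain YAML scalar.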

---

## Step 5: Archive source in `{PROCESSED}`

Create `{PROCESSED}/<YYYY-MM-DD>_<source-slug>.md` with bullet summary, map of extracted units, and the full original copy.

After saving the copy — if the source came from `{RAW}`, delete it from there (`{RAW}` should be empty after ingest).
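
The archive step could look roughly like the Python sketch below. `archive_source` and the section headings inside the archive file are assumptions; the skill only requires a bullet summary, a map of extracted units, and the full original.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def archive_source(source: Path, processed: Path, raw: Path,
                   summary: list[str], page_map: list[str],
                   source_slug: str, today: Optional[date] = None) -> Path:
    """Write the archive file, then remove the source if it lived in raw."""
    day = (today or date.today()).isoformat()
    original = source.read_text(encoding="utf-8")
    target = processed / f"{day}_{source_slug}.md"
    parts = [
        "## Summary", *[f"- {point}" for point in summary], "",
        "## Extracted pages", *[f"- {page}" for page in page_map], "",
        "## Original", original,
    ]
    processed.mkdir(parents=True, exist_ok=True)
    target.write_text("\n".join(parts), encoding="utf-8")
    # Delete only after the copy is written, and only for files under raw.
    if raw.resolve() in source.resolve().parents:
        source.unlink()
    return target
```

Writing the archive before deleting keeps the original recoverable if anything fails partway through.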

---

## Step 6: Update `index.md` and `log.md`

**`index.md`** — add new pages under the right section, keep existing ones, group alphabetically. Line format: `- [[pages/<category>/<slug>]] — short description of the thesis`.

**`log.md`** — append-only changelog. Never modify past entries, only append to the bottom.
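
Both update rules can be sketched in a few lines of Python. The helper names are hypothetical, and the log entry is left as a free-form string because the skill defines no log entry format.

```python
from pathlib import Path


def index_line(category: str, slug: str, description: str) -> str:
    # \u2014 is the dash separator used in the line format above.
    return f"- [[pages/{category}/{slug}]] \u2014 {description}"


def add_to_section(entries: list[str], new_entry: str) -> list[str]:
    """Keep existing entries, add the new one, and sort alphabetically."""
    merged = set(entries) | {new_entry}
    return sorted(merged, key=str.lower)


def append_log(log_path: Path, entry: str) -> None:
    # Mode "a" only writes at the end of the file, so past entries stay untouched.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(entry.rstrip("\n") + "\n")
```

Opening the log in append mode is one simple way to honor the append-only rule, since nothing earlier in the file is rewritten.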

---

## Step 7: Report

```
✅ Ingest complete: <source-title>

📄 Source: _processed/YYYY-MM-DD_<slug>.md
🆕 Created N pages
🔄 Updated M pages
📊 Wiki: X pages total, Y sources processed.
```
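
The totals in the report can be derived from the folders themselves. The Python sketch below assumes that every `.md` file under `{PAGES}` is a page and every `.md` file in `{PROCESSED}` is one processed source; `wiki_totals` and `format_report` are hypothetical names.

```python
from pathlib import Path


def wiki_totals(pages: Path, processed: Path) -> tuple[int, int]:
    """Count pages (recursively) and archived sources (top level only)."""
    page_count = sum(1 for _ in pages.rglob("*.md"))
    source_count = sum(1 for _ in processed.glob("*.md"))
    return page_count, source_count


def format_report(source_title: str, archive_path: str, created: int,
                  updated: int, total_pages: int, total_sources: int) -> str:
    return "\n".join([
        f"✅ Ingest complete: {source_title}",
        "",
        f"📄 Source: {archive_path}",
        f"🆕 Created {created} pages",
        f"🔄 Updated {updated} pages",
        f"📊 Wiki: {total_pages} pages total, {total_sources} sources processed.",
    ])
```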

---

## What NOT to do

- **Don't batch** multiple sources — one at a time, with a report after each.
- **Don't modify** content in `{PROCESSED}` except appending the page map.
- **Don't commit** automatically.
- **Don't load** all of `{PAGES}` into context.
- **Don't force** a number of pages.
- **Don't create a new category blindly** — stop and ask.

When to use

Run on explicit command — processing can take a while, so it should be intentional:

/learner-wiki-ingest
“add to wiki”
“process this material”
“ingest transcript”

What it accepts

Text (.md, .txt) — articles, transcripts, notes
Images (screenshot, diagram, slide) — describes visual content
Source items from Voicie integration

What it saves

Atomic pages in Wiki/pages/{category}/{slug}.md — each with a thesis-title, not a label
Source archive in Wiki/_processed/ — original + bullet summary + page map
index.md update — page catalog with descriptions
log.md update — append-only ingest changelog

How to install

1. Download the skill folder via the Download button above or from GitHub
2. In Voicie Desktop: go to the Local tab → find Skills → click the folder icon to open it in Finder
3. Move the downloaded skill folder into that directory
4. Open a new chat; the skill is now available
5. Call it with natural language or /<skill-name>

→ Full guide: How to install skills in Voicie

Related skills

Learner: Wiki Query
