Graph Analysis
==============

Content Chimera can build a **link graph** of your crawled website: a map of how every page connects to every other page through links. This unlocks a category of analysis that isn't possible when looking at pages individually. You can find broken links, trace redirect chains, identify orphan pages, and see which sections of a site are tightly or loosely connected.

.. contents:: On this page
   :local:
   :depth: 2

What the Graph Tells You
------------------------

Once the graph database is loaded, you can answer questions like:

- **Broken links:** Which pages link to URLs that return 404 or 500 errors?
- **Redirect chains:** Which pages go through multiple redirects before reaching their destination?
- **Orphan pages:** Which pages have no internal links pointing to them?
- **Link depth:** How many clicks does it take to reach a page from the homepage?
- **Cross-section linking:** How do different folders or content areas link to each other?
- **Status code distribution:** What's the full picture of HTTP responses across all discovered URLs (including errors)?

.. note::

   The graph database includes **all** URLs discovered during the crawl, including error pages (4xx, 5xx) that don't appear in the regular flattened table. This makes it the best tool for understanding crawl health.

Loading the Graph
-----------------

Before you can run graph queries, the graph database needs to be built from your crawl data. Building it is a separate pipeline step because it requires additional processing, but that step runs automatically as part of the standard crawl pipeline. If you need to reload the graph separately (for example, after adding custom node properties), use MCP.

.. note::

   Neither the web UI nor Chimera Chat can currently load the graph database on demand. If Chimera Chat tells you the graph isn't available, use MCP to load it.

.. admonition:: MCP
   :class: tip

   Ask your AI assistant to load the graph database for your site.
   For example:

   *"Load the graph database for this site"*

   You can also ask it to include extra data fields as node properties, so they are available when querying and visualizing the graph:

   *"Load the graph database and include Content Type and Word Count as node properties"*

   **Tool reference**

   Tool: ``run-focused-pipeline``

   ::

      pipeline: "load_graph_db"
      extent_id: 1234
      neo4j_node_attributes: ["Content Type", "Word Count"]

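
The questions listed under *What the Graph Tells You* correspond to standard graph operations. The following is a minimal sketch on a toy link graph, not Content Chimera's implementation. The URLs, status codes, and function names are hypothetical and chosen only for illustration.

.. code-block:: python

   from collections import deque

   # Hypothetical toy crawl: URL -> HTTP status code.
   STATUS = {
       "/": 200,
       "/about": 200,
       "/blog": 200,
       "/blog/post-1": 200,
       "/old-page": 404,
       "/landing": 200,  # no page links here
   }

   # Hypothetical internal links: source URL -> list of target URLs.
   LINKS = {
       "/": ["/about", "/blog"],
       "/about": ["/old-page"],
       "/blog": ["/blog/post-1"],
       "/blog/post-1": ["/", "/old-page"],
   }


   def broken_links(links, status):
       """Return (source, target) pairs whose target returned a 4xx or 5xx status."""
       return sorted(
           (src, dst)
           for src, targets in links.items()
           for dst in targets
           if status.get(dst, 0) >= 400
       )


   def orphan_pages(links, status, home="/"):
       """Return URLs with no inbound internal links (the homepage is excluded)."""
       linked = {dst for targets in links.values() for dst in targets}
       return sorted(url for url in status if url not in linked and url != home)


   def link_depth(links, home="/"):
       """Breadth-first search: minimum number of clicks from the homepage."""
       depth = {home: 0}
       queue = deque([home])
       while queue:
           page = queue.popleft()
           for target in links.get(page, []):
               if target not in depth:
                   depth[target] = depth[page] + 1
                   queue.append(target)
       return depth

Pages that never appear in the result of ``link_depth`` are unreachable from the homepage by following links, which is one reason orphan pages and link depth are usually reviewed together.
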

Asking Questions About the Graph
--------------------------------

The graph supports natural language questions. Behind the scenes, your question is translated into a graph query language and run against the link map.

.. admonition:: Chimera Chat
   :class: tip

   Ask questions naturally. Chimera Chat will automatically route graph-related questions to the graph database.

   Examples:

   - *"How many broken links are there?"*
   - *"Show me pages that redirect more than twice"*
   - *"Which folders have the most internal links?"*
   - *"Are there any orphan pages with no inbound links?"*
   - *"What's the average crawl depth?"*

.. admonition:: MCP
   :class: tip

   Ask your AI assistant the same kinds of questions you would in Chimera Chat. It will automatically query the graph database.

   Examples:

   *"How many pages return a 404 status?"*

   *"Which folders have the most internal links?"*

   *"Are there any orphan pages?"*

   **Tool reference**

   Tool: ``chimera-query`` with ``broad_type: "graph_db"``

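
As an illustration of what a question like "Show me pages that redirect more than twice" asks of the link map, here is a small sketch that follows redirects hop by hop. It does not show the query Content Chimera actually generates; the redirect map and function names are hypothetical.

.. code-block:: python

   # Hypothetical redirect map: URL -> URL it redirects to (3xx responses only).
   REDIRECTS = {
       "/promo": "/promo-2023",
       "/promo-2023": "/promo-2024",
       "/promo-2024": "/offers",
       "/contact-us": "/contact",
   }


   def redirect_chain(url, redirects, max_hops=20):
       """Follow redirects from url and return the full chain, including the start."""
       chain = [url]
       while chain[-1] in redirects and len(chain) <= max_hops:
           nxt = redirects[chain[-1]]
           if nxt in chain:  # redirect loop: stop instead of spinning forever
               break
           chain.append(nxt)
       return chain


   def long_chains(redirects, min_hops=3):
       """Return chains with at least min_hops redirects, keyed by starting URL."""
       result = {}
       for start in redirects:
           chain = redirect_chain(start, redirects)
           if len(chain) - 1 >= min_hops:
               result[start] = chain
       return result

"More than twice" is interpreted here as three or more hops, so only ``/promo`` qualifies in the toy data.
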

Visualizing the Link Graph
--------------------------

The **Network** chart type renders an interactive diagram showing pages (or groups of pages) as nodes and links between them as edges. You can explore how content areas connect visually.

There are two modes:

**Field linkage:** Groups pages by a field (like folder) and shows how those groups link to each other. Good for understanding site structure at a high level.

**Page linkage:** Shows individual pages and their direct links. Best for small subsets of the site (use filters to focus on a specific section).

.. admonition:: Web UI
   :class: tip

   From the chart area on the **Assets & Metadata** page:

   1. Select **Network** from the chart type pulldown
   2. Choose a **Node Field** (e.g., Folder 1) for field linkage, or leave it on URL for page linkage
   3. Optionally set **Node Color** to color-code by a field like Status Code
   4. Use filters to focus on a manageable number of nodes

.. admonition:: Chimera Chat
   :class: tip

   Ask:

   - *"Show me a network diagram of how folders link to each other"*
   - *"Create a network chart colored by content type"*

.. admonition:: MCP
   :class: tip

   Ask your AI assistant to create a network diagram. You can describe what you want in plain English:

   *"Create a network diagram showing how folders link to each other"*

   *"Show me a network chart of folder connections, colored by status code"*

   **Tool reference**

   Tool: ``chart``

   ::

      chart_config: {
        "chart_type": "network",
        "field_name": "folder1",
        "color_by_field": "status_code"
      }
      extent_id: 1234

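
The difference between the two modes can be sketched as an aggregation: field linkage collapses page-to-page links into weighted group-to-group edges. The example below is illustrative only. The link data, the ``folder1`` helper, and its rule for top-level pages are assumptions, not Content Chimera's definition of the Folder 1 field.

.. code-block:: python

   from collections import Counter

   # Hypothetical page-level links: (source URL, target URL).
   PAGE_LINKS = [
       ("/blog/post-1", "/products/widget"),
       ("/blog/post-2", "/products/widget"),
       ("/blog/post-2", "/blog/post-1"),
       ("/products/widget", "/support/faq"),
       ("/", "/blog/post-1"),
   ]


   def folder1(url):
       """Return the first path segment, or "(root)" for pages outside any folder."""
       parts = [p for p in url.split("/") if p]
       return parts[0] if len(parts) > 1 else "(root)"


   def field_linkage(page_links, group_by=folder1):
       """Collapse page-to-page links into weighted group-to-group edges."""
       return Counter((group_by(src), group_by(dst)) for src, dst in page_links)

Page linkage corresponds to using ``PAGE_LINKS`` directly, one edge per link, which is why it suits small, filtered subsets of a site.
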

Custom Node Properties
----------------------

By default, graph nodes carry basic crawl metadata: URL, status code, crawl depth, response type, and folder structure. You can enrich the graph by adding fields from your flattened table as custom node properties. For example, adding "Content Type" or "Word Count" lets you filter and color network charts by those dimensions.

.. admonition:: MCP
   :class: tip

   Ask your AI assistant to reload the graph with additional fields:

   *"Reload the graph database and include Content Type, Word Count, and Last Modified as node properties"*

   These attributes persist: future graph reloads will use the same configuration unless you change it.

   **Tool reference**

   Tool: ``run-focused-pipeline``

   ::

      pipeline: "load_graph_db"
      extent_id: 1234
      neo4j_node_attributes: ["Content Type", "Word Count", "Last Modified"]

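
Conceptually, adding custom node properties is a join between graph nodes and flattened-table rows on the URL. The sketch below shows that idea under assumed data shapes. The node dictionaries, table rows, and ``enrich_nodes`` function are hypothetical and do not reflect how the graph database stores properties.

.. code-block:: python

   # Hypothetical graph nodes carrying only default crawl metadata.
   NODES = {
       "/": {"status_code": 200, "crawl_depth": 0},
       "/blog/post-1": {"status_code": 200, "crawl_depth": 2},
       "/old-page": {"status_code": 404, "crawl_depth": 2},
   }

   # Hypothetical flattened-table rows; error pages such as /old-page are absent.
   TABLE_ROWS = [
       {"URL": "/", "Content Type": "Landing page", "Word Count": 320},
       {"URL": "/blog/post-1", "Content Type": "Article", "Word Count": 1450},
   ]


   def enrich_nodes(nodes, rows, fields):
       """Return a copy of nodes with the chosen table fields added as properties."""
       by_url = {row["URL"]: row for row in rows}
       enriched = {}
       for url, props in nodes.items():
           row = by_url.get(url, {})
           extra = {field: row[field] for field in fields if field in row}
           enriched[url] = {**props, **extra}
       return enriched

Nodes without a matching table row, such as error pages, simply keep their default properties in this sketch.
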

Tips for Graph Analysis
-----------------------

- **Start broad, then narrow.** Begin with high-level questions ("How many broken links?") before diving into specifics ("Which pages in /blog link to 404s?").
- **Use filters on network charts.** Rendering thousands of nodes is slow and hard to read. Filter to a specific folder or status code range.
- **Combine with structured queries.** The graph tells you about relationships; the flattened table tells you about page attributes. Use both together: for example, find broken links via the graph, then check the content type of linking pages via a structured query.
- **Graph data includes errors.** Unlike charts built from the flattened table (which only shows 2xx/3xx pages), the graph contains all discovered URLs. This is important when assessing crawl health.
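
The "combine with structured queries" tip can be sketched as two steps: take broken links from the graph side, then look up the content type of each linking page on the table side. The edge list, lookup table, and function name below are hypothetical stand-ins for the results of a graph query and a structured query.

.. code-block:: python

   from collections import Counter

   # Hypothetical graph result: (source URL, target URL, target status code).
   EDGES = [
       ("/blog/post-1", "/old-page", 404),
       ("/blog/post-2", "/old-page", 404),
       ("/about", "/team", 200),
       ("/products/widget", "/retired", 500),
   ]

   # Hypothetical structured-query result: URL -> Content Type.
   CONTENT_TYPE = {
       "/blog/post-1": "Article",
       "/blog/post-2": "Article",
       "/about": "Landing page",
       "/products/widget": "Product page",
   }


   def broken_links_by_content_type(edges, content_type):
       """Count broken links grouped by the content type of the linking page."""
       return Counter(
           content_type.get(src, "(unknown)")
           for src, _dst, status in edges
           if status >= 400
       )

A breakdown like this shows which kind of content is responsible for most broken links, which helps decide where to start fixing.
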