Graph Analysis

Content Chimera can build a link graph of your crawled website — a map of how every page connects to every other page through links. This unlocks a category of analysis that isn’t possible by looking at pages individually: you can find broken links, trace redirect chains, identify orphan pages, and see which sections of a site are tightly or loosely connected.

What the Graph Tells You

Once the graph database is loaded, you can answer questions like:

  • Broken links: Which pages link to URLs that return 4xx or 5xx errors (such as 404 or 500)?

  • Redirect chains: Which pages go through multiple redirects before reaching their destination?

  • Orphan pages: Which pages have no internal links pointing to them?

  • Link depth: How many clicks does it take to reach a page from the homepage?

  • Cross-section linking: How do different folders or content areas link to each other?

  • Status code distribution: What’s the full picture of HTTP responses across all discovered URLs (including errors)?

Note

The graph database includes all URLs discovered during the crawl — including error pages (4xx, 5xx) that don’t appear in the regular flattened table. This makes it the best tool for understanding crawl health.
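To make these questions concrete, here is a minimal Python sketch that answers three of them (broken links, orphan pages, link depth) on a tiny hand-made link graph. It is an illustration only, not how Content Chimera implements its graph database; the URLs, status codes, and function names are all hypothetical.

```python
from collections import deque

# Hypothetical crawl data: URL -> HTTP status code.
status = {
    "/": 200,
    "/about": 200,
    "/blog": 200,
    "/blog/post-1": 200,
    "/old-page": 404,
    "/landing": 200,  # no page links here
}

# Hypothetical internal links as (source, target) pairs.
links = [
    ("/", "/about"),
    ("/", "/blog"),
    ("/blog", "/blog/post-1"),
    ("/blog/post-1", "/old-page"),
    ("/about", "/"),
]


def broken_links(links, status):
    """Return (source, target) pairs whose target returned a 4xx or 5xx status."""
    return [(s, t) for s, t in links if status.get(t, 0) >= 400]


def orphan_pages(links, status, home="/"):
    """Return URLs that no other page links to (the homepage is excluded)."""
    targets = {t for s, t in links if s != t}
    return sorted(u for u in status if u not in targets and u != home)


def link_depth(links, home="/"):
    """Breadth-first search from the homepage: URL -> minimum number of clicks."""
    adjacency = {}
    for s, t in links:
        adjacency.setdefault(s, []).append(t)
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for nxt in adjacency.get(page, []):
            if nxt not in depth:
                depth[nxt] = depth[page] + 1
                queue.append(nxt)
    return depth


print(broken_links(links, status))
print(orphan_pages(links, status))
print(link_depth(links))
```

Note that the orphan page never appears in the depth result, because it cannot be reached by following links from the homepage. The error page does appear, which mirrors the point above: a link graph keeps error URLs as nodes.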

Loading the Graph

Graph queries run against a graph database that is built from your crawl data. Building it is its own pipeline step because it requires additional processing.

In most cases you don't need to trigger this step yourself: the graph loads automatically as part of the standard crawl pipeline. If you need to reload it separately (for example, after adding custom node properties), use MCP.

Note

Neither the web UI nor Chimera Chat can currently load the graph database on demand. If Chimera Chat tells you the graph isn’t available, use MCP to load it.

MCP

Ask your AI assistant to load the graph database for your site. For example:

“Load the graph database for this site”

You can also ask it to include extra data fields as node properties, so they are available when querying and visualizing the graph:

“Load the graph database and include Content Type and Word Count as node properties”

Tool reference

Tool: run-focused-pipeline

pipeline: "load_graph_db"
extent_id: 1234
neo4j_node_attributes: ["Content Type", "Word Count"]
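Your AI assistant normally builds this call for you. If you are curious what it looks like on the wire, the sketch below wraps the arguments from the tool reference in a generic MCP tools/call request. The tool name and argument names come from the reference above; the helper function, the request id, and the assumption that the server accepts the arguments in exactly this shape are illustrative, and 1234 is a placeholder extent ID.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap a tool name and its arguments in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Arguments copied from the tool reference; 1234 is a placeholder extent ID.
request = build_tool_call(
    "run-focused-pipeline",
    {
        "pipeline": "load_graph_db",
        "extent_id": 1234,
        "neo4j_node_attributes": ["Content Type", "Word Count"],
    },
)

print(json.dumps(request, indent=2))
```

Leaving out neo4j_node_attributes would correspond to the plain "Load the graph database for this site" request.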

Asking Questions About the Graph

The graph supports natural language questions. Behind the scenes, your question is translated into a graph query and run against the link graph.
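As a rough illustration of the kind of traversal such a query performs, the Python sketch below answers "which pages redirect more than twice?" on a small hand-made redirect map. This is not the query Content Chimera generates; the data and function names are hypothetical.

```python
# Hypothetical redirect map: URL -> URL it redirects to.
redirects = {
    "/a": "/b",
    "/b": "/c",
    "/c": "/d",
    "/x": "/y",
    "/loop-1": "/loop-2",
    "/loop-2": "/loop-1",
}


def redirect_chain(start, redirects):
    """Follow redirects from start and return every URL visited, in order.

    Stops at the first URL that does not redirect, or as soon as a URL
    repeats (a redirect loop).
    """
    chain = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            break  # redirect loop: stop instead of following it forever
        seen.add(current)
    return chain


def long_chains(redirects, min_hops=3):
    """Return {start URL: chain} for chains with at least min_hops redirects."""
    result = {}
    for url in redirects:
        chain = redirect_chain(url, redirects)
        if len(chain) - 1 >= min_hops:
            result[url] = chain
    return result


print(long_chains(redirects))
```

With min_hops=3, only the chain starting at /a qualifies, since it passes through three redirects before reaching /d.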

Chimera Chat

Ask questions naturally. Chimera Chat will automatically route graph-related questions to the graph database. Examples:

  • “How many broken links are there?”

  • “Show me pages that redirect more than twice”

  • “Which folders have the most internal links?”

  • “Are there any orphan pages with no inbound links?”

  • “What’s the average crawl depth?”

MCP

Ask your AI assistant the same kinds of questions you would in Chimera Chat. It will automatically query the graph database. Examples:

“How many pages return a 404 status?”

“Which folders have the most internal links?”

“Are there any orphan pages?”

Tool reference

Tool: chimera-query with broad_type: "graph_db"

Custom Node Properties

By default, graph nodes carry basic crawl metadata: URL, status code, crawl depth, response type, and folder structure. You can enrich the graph by adding fields from your flattened table as custom node properties — for example, adding “Content Type” or “Word Count” lets you filter and color network charts by those dimensions.

MCP

Ask your AI assistant to reload the graph with additional fields:

“Reload the graph database and include Content Type, Word Count, and Last Modified as node properties”

These attributes persist — future graph reloads will use the same configuration unless you change it.

Tool reference

Tool: run-focused-pipeline

pipeline: "load_graph_db"
extent_id: 1234
neo4j_node_attributes: ["Content Type", "Word Count", "Last Modified"]

Tips for Graph Analysis

  • Start broad, then narrow. Begin with high-level questions (“How many broken links?”) before diving into specifics (“Which pages in /blog link to 404s?”).

  • Use filters on network charts. A chart with thousands of nodes is slow to render and hard to read. Filter to a specific folder or status code range.

  • Combine with structured queries. The graph tells you about relationships; the flattened table tells you about page attributes. Use both together — for example, find broken links via the graph, then check the content type of linking pages via a structured query.

  • Graph data includes errors. Unlike charts built from the flattened table (which contains only 2xx/3xx pages), the graph contains all discovered URLs. This is important when assessing crawl health.
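The "combine with structured queries" tip can be sketched in a few lines of Python. Assume you have exported one result from each source: a list of broken links from the graph and a set of page attributes from the flattened table. The record shapes, field names, and values below are hypothetical and only show the join, not an actual Content Chimera export format.

```python
from collections import Counter

# Hypothetical graph result: links whose target returned an error.
broken = [
    {"source": "/blog/post-1", "target": "/old-page", "status": 404},
    {"source": "/docs/setup", "target": "/downloads/v1", "status": 404},
    {"source": "/blog/post-2", "target": "/old-page", "status": 404},
]

# Hypothetical flattened-table rows, keyed by URL.
pages = {
    "/blog/post-1": {"Content Type": "Blog Post", "Word Count": 820},
    "/blog/post-2": {"Content Type": "Blog Post", "Word Count": 450},
    "/docs/setup": {"Content Type": "Documentation", "Word Count": 1300},
}


def broken_links_by_content_type(broken, pages):
    """Count broken links grouped by the content type of the linking page."""
    counts = Counter()
    for link in broken:
        attrs = pages.get(link["source"], {})
        # Linking pages missing from the table are grouped under "Unknown".
        counts[attrs.get("Content Type", "Unknown")] += 1
    return dict(counts)


print(broken_links_by_content_type(broken, pages))
```

The graph supplies the relationship (which page links to which broken URL), and the table supplies the attribute (content type) used to group the results.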