Using Azure DevOps Wiki Content for AI Chatbots

By Anatoly Mironov

April 30, 2025

Imagine you want to use the content from all pages in a wiki from an Azure DevOps project, perhaps for an AI chatbot or another application. In this post, I’ll share a simple Python script to load an Azure DevOps (AzDo) wiki into memory, including the markdown content and the real URLs of the pages.

Script Explanation

Here’s the script:

	# This is a simple python script
	# that loads all the pages through Azure DevOps APIs
	# including their content (in markdown) and remote urls
	# it can then be used in an AI solution

	from azure.identity import DefaultAzureCredential
	import requests
	import json

	# run az login first
	credential = DefaultAzureCredential()
	# Get the bearer token
	azdo_scope = "499b84ac-1321-427f-aa17-267ca6975798/.default"
	token = credential.get_token(azdo_scope).token

	headers = {"Authorization": f"Bearer {token}"}

	# set the variables
	organization = "tolle"
	project = "my-project"
	wiki_name = "my-knowledge"

	base_url = f"https://dev.azure.com/{organization}/{project}"
	pages_base_url = f"{base_url}/_apis/wiki/wikis/{wiki_name}/pages"


	# Recursive function to flatten the JSON structure
	def flatten_pages(pages, result=None):
	if result is None:
	result = []

	for page in pages:
	# Add the current page's path and remoteUrl to the result
	if page.get("path") != "/":
	print(f"Page path: {page.get('path')}")
	result.append(
	{
	"path": page.get("path"),
	"remoteUrl": page.get("remoteUrl"),
	}
	)

	# If there are subPages, recursively process them
	if "subPages" in page and page["subPages"]:
	flatten_pages(page["subPages"], result)

	return result


	url = f"{pages_base_url}?path=/&recursionLevel=full&includeContent=True&api-version=7.1"
	response = requests.get(url, headers=headers)
	response.raise_for_status()
	wiki_pages = response.json()
	flat_pages = flatten_pages([wiki_pages])

	# Initialize an array to store the new objects
	pages_with_content = []

	# Iterate through each page in the flattened list
	for page in flat_pages:
	# Get the content of the page using the Azure DevOps API
	page_path = page["path"]
	page_url = f"{pages_base_url}?path={page_path}&includeContent=True&api-version=7.1"

	response = requests.get(page_url, headers=headers)
	p = response.json()
	# Check if the request was successful
	if response.status_code == 200:
	content = p.get("content")
	if content:
	pages_with_content.append(
	{
	"content": content,
	"path": page["path"],
	"remoteUrl": page["remoteUrl"],
	}
	)
	else:
	print(
	f"Failed to fetch content for {page['remoteUrl']}: {response.status_code} - {response.text}"
	)


	# Print the resulting array
	print(json.dumps(pages_with_content, indent=2))

view raw load_azdo_wiki_content.py hosted with ❤ by GitHub

Limitations

This is just a simple example and not suitable for production use. Here are some limitations:

It lacks error handling,
it does not read the images
it does not consider the order and page/subpage relationships - which might be important for understanding the content better.

Advantages

Despite its limitations, the script has some advantages:

It uses DefaultAzureCredential, making it easier to work with and preparing it for running in the cloud (e.g., using managed identies).

Resources

For more information, check out these resources::

azure-devops, a thin wrapper around the Azure DevOps REST APi. I discovered it after I started looking at the APIs. In my opinion, what I want to achieve, is better served by calling the APIs directly, which reduces the risk of potential errors.
wikis - Azure DevOps REST API Reference, the actual api reference for wikis.

Conclusion

I hope you find this script useful for loading Azure DevOps wiki content into memory. Feel free to modify and expand it to suit your needs. Happy coding!