Data connectors automatically sync content from external sources into Zine (powered by Graphlit). Instead of relying on manual file uploads, connectors continuously monitor Slack channels, Gmail inboxes, Google Drive folders, GitHub repos, and 30+ other sources, keeping your knowledge base up to date.
This guide covers feed architecture, OAuth vs API key authentication, all connector types, polling strategies, and production patterns. By the end, you'll know how to connect any data source and build automated content pipelines.
What You'll Learn
- Feed architecture and lifecycle
- OAuth flows vs API key authentication
- Connector patterns by category (messaging, cloud storage, project management)
- Feed configuration options (readLimit, schedules, filters)
- Polling vs webhook patterns
- Production feed management
- Error handling and retry strategies
Prerequisites:
- A Graphlit project - Sign up (2 min)
- SDK installed: npm install graphlit-client (30 sec)
- OAuth apps set up for connectors you want to use (we'll show you how)
Time to complete: 80 minutes
Difficulty: Intermediate
Developer Note: All Graphlit IDs are GUIDs. Example outputs show realistic GUID format.
Table of Contents
- Feed Architecture
- Authentication Methods
- Messaging Connectors
- Cloud Storage Connectors
- Project Management Connectors
- Social Media & Web Connectors
- Feed Management
- Advanced Patterns
- Production Patterns
Part 1: Feed Architecture
What is a Feed?
A feed is a continuous sync between an external data source and Graphlit. Once created, it:
- Initial sync: Fetches existing content (e.g., last 100 Slack messages)
- Continuous monitoring: Polls for new content (e.g., every 15 minutes)
- Auto-ingestion: New content automatically appears in Graphlit
Key insight: Feeds are "set it and forget it"—no manual re-triggering needed.
✅ Quick Win: Once a feed is created, new content automatically appears in your search results and RAG responses—no additional code needed.
All 30+ Supported Connectors
Zine supports 30+ connector types across 8 categories:
Messaging & Collaboration (6):
- Slack - Channels, threads, DMs
- Microsoft Teams - Team channels and conversations
- Discord - Server channels
- Gmail - Email inbox (labels, folders)
- Outlook Email - Microsoft email
- Intercom - Support articles and tickets
Cloud Storage (8):
- Google Drive - Docs, Sheets, Slides, PDFs
- Microsoft OneDrive - Personal cloud storage
- SharePoint - Enterprise document management
- Dropbox - Files and folders
- Box - Enterprise file storage
- Amazon S3 - Object storage buckets
- Azure Blob Storage - Cloud file storage
- FTP/SFTP - File servers
Source Control & Development (5):
- GitHub Code - Repository contents
- GitHub Issues - Bug tracking and discussions
- GitHub Pull Requests - Code reviews
- GitHub Commits - Change history
- GitLab - Code and issues
Project Management (4):
- Jira - Issue tracking
- Linear - Modern project management
- Trello - Kanban boards
- Asana - Task management
Knowledge Management (2):
- Notion - Pages and databases
- Confluence - Wiki pages
Social Media & Web (6):
- Reddit - Posts and comments
- Twitter/X - Tweets and threads
- YouTube - Video transcripts
- RSS Feeds - Blog feeds
- Web Crawling - Website content
- Web Search - Tavily, Exa, Perplexity
Calendars & Meetings (3):
- Google Calendar - Events and meetings
- Outlook Calendar - Microsoft calendar
- Zoom - Meeting recordings (transcribed)
Customer & Sales (2):
- Zendesk - Support tickets
- Salesforce - CRM data (custom integration)
Feed Lifecycle
CREATE → ENABLED → SYNCING → INDEXED
↓
DISABLED (if paused)
↓
DELETED (if removed)
Part 2: Authentication Methods
OAuth (Recommended for Most Connectors)
OAuth lets users authorize access without sharing passwords. Graphlit manages the OAuth flow.
Connectors using OAuth:
- Slack
- Gmail / Google Drive / Google Calendar
- Microsoft (Outlook, OneDrive, SharePoint, Teams)
- GitHub
- Notion
- Jira
- Linear
OAuth flow:
- User clicks "Connect Slack"
- Redirected to Slack OAuth
- User authorizes
- Graphlit receives OAuth token
- Create feed with token
// Example: Slack OAuth
const authUrl = `https://slack.com/oauth/v2/authorize?client_id=${SLACK_CLIENT_ID}&scope=channels:read,channels:history&redirect_uri=${encodeURIComponent(REDIRECT_URI)}`;
// User visits authUrl and authorizes
// Slack redirects back to REDIRECT_URI with a `code` query parameter
// Exchange the code for a token
const tokenResponse = await fetch('https://slack.com/api/oauth.v2.access', {
  method: 'POST',
  headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
  // redirect_uri must match the one sent in the authorize step
  body: new URLSearchParams({
    code,
    client_id: SLACK_CLIENT_ID,
    client_secret: SLACK_CLIENT_SECRET,
    redirect_uri: REDIRECT_URI
  })
});
const tokenData = await tokenResponse.json();
// Slack reports failures in the JSON body (ok: false), not only via the HTTP status
if (!tokenData.ok) {
  throw new Error(`Slack OAuth failed: ${tokenData.error}`);
}
const { access_token } = tokenData;
// Create feed with token
const feed = await graphlit.createFeed({
  name: 'My Slack Feed',
  type: FeedTypes.Slack,
  slack: {
    token: access_token,
    channel: 'general', // Single channel per feed
    readLimit: 100
  }
});
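The authorize step above builds the URL by string interpolation. As a minimal sketch, the same URL can be assembled with `URLSearchParams` so every value is encoded, and with a `state` value (the standard OAuth 2.0 anti-CSRF parameter) that you compare again when Slack redirects back. The helper name `buildSlackAuthorizeUrl`, its options type, and the example values are illustrative assumptions, not part of the Graphlit SDK.

```typescript
// Hypothetical options for building the Slack authorize URL.
interface SlackAuthorizeOptions {
  clientId: string;
  redirectUri: string;
  scopes: string[];
  state: string; // random per-session value, verified again on the callback
}

// Builds the Slack OAuth v2 authorize URL with all values URL-encoded.
function buildSlackAuthorizeUrl(options: SlackAuthorizeOptions): string {
  const params = new URLSearchParams({
    client_id: options.clientId,
    scope: options.scopes.join(','),
    redirect_uri: options.redirectUri,
    state: options.state,
  });
  return `https://slack.com/oauth/v2/authorize?${params.toString()}`;
}

// Example values are placeholders.
const exampleAuthUrl = buildSlackAuthorizeUrl({
  clientId: '1234.5678',
  redirectUri: 'https://app.example.com/oauth/callback',
  scopes: ['channels:read', 'channels:history'],
  state: 'random-state-value',
});
console.log(exampleAuthUrl);
```

On the callback, reject the request if the returned `state` does not match the value you stored for that user session.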
API Keys (For Services Without OAuth)
Some connectors authenticate with static credentials such as API keys, and a few need no authentication at all:
- RSS feeds (no auth)
- Web crawling (no auth)
- S3 (access key + secret)
- Azure Storage (connection string)
import { FeedTypes, FeedServiceTypes } from 'graphlit-client/dist/generated/graphql-types';
// Example: S3 feed with API keys
const s3Feed = await graphlit.createFeed({
name: 'Company S3 Bucket',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.S3Blob,
s3: {
bucketName: 'documents',
region: 'us-east-1',
accessKey: process.env.AWS_ACCESS_KEY,
secretAccessKey: process.env.AWS_SECRET_KEY,
prefix: 'pdfs/' // Optional: filter by folder
},
isRecursive: true,
readLimit: 1000
}
});
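The S3 example reads its credentials from environment variables, which are `undefined` when unset. A small guard can fail fast with a clear message before any feed is created. This is only a sketch: `requireEnv` is a hypothetical helper, and it takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
// Returns the requested variables, or throws listing every missing one.
function requireEnv<K extends string>(
  names: readonly K[],
  env: Record<string, string | undefined>
): Record<K, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const values = {} as Record<K, string>;
  for (const name of names) {
    values[name] = env[name] as string; // safe: presence was checked above
  }
  return values;
}

// In an application you would pass process.env; a literal object is used here.
const credentials = requireEnv(['AWS_ACCESS_KEY', 'AWS_SECRET_KEY'], {
  AWS_ACCESS_KEY: 'example-access-key',
  AWS_SECRET_KEY: 'example-secret-key',
});
console.log(Object.keys(credentials));
```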
Part 3: Messaging Connectors
Slack
Use case: Search team conversations, RAG over chat history, entity extraction from messages.
import { Graphlit } from 'graphlit-client';
import { FeedTypes, FeedServiceTypes } from 'graphlit-client/dist/generated/graphql-types';
const graphlit = new Graphlit();
// Create Slack feed
const slackFeed = await graphlit.createFeed({
name: 'Engineering Slack',
type: FeedTypes.Slack,
slack: {
token: process.env.SLACK_BOT_TOKEN,
channel: 'engineering', // Single channel
readLimit: 500 // Last 500 messages
}
});
console.log('Slack feed created:', slackFeed.createFeed.id);
// Wait for initial sync
let isDone = false;
while (!isDone) {
  const status = await graphlit.isFeedDone(slackFeed.createFeed.id);
  isDone = status.isFeedDone.result;
  if (!isDone) {
    await new Promise(r => setTimeout(r, 10000)); // Check every 10s
  }
}
console.log('✓ Slack history synced');
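The polling loop above runs until the feed reports done, with no upper bound. The troubleshooting section later calls a `waitForFeedCompletion` helper that this guide never defines; the following is one possible shape for it, written as a sketch. It assumes only the `isFeedDone` call and response shape used above, takes the client as an explicit argument so it can be tested with a stub, and the option names `intervalMs` and `timeoutMs` are made up for illustration.

```typescript
// Minimal shape of the client call used in the loop above (assumed).
interface FeedStatusClient {
  isFeedDone(id: string): Promise<{ isFeedDone?: { result?: boolean | null } | null }>;
}

interface WaitOptions {
  intervalMs?: number; // delay between checks
  timeoutMs?: number;  // give up after this long
}

// Polls until the feed reports done, or throws once the timeout would be exceeded.
async function waitForFeedCompletion(
  client: FeedStatusClient,
  feedId: string,
  options: WaitOptions = {}
): Promise<void> {
  const { intervalMs = 10000, timeoutMs = 600000 } = options;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await client.isFeedDone(feedId);
    if (status.isFeedDone?.result) return;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Feed ${feedId} did not finish within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the SDK client from this section, a call would look like `await waitForFeedCompletion(graphlit, slackFeed.createFeed.id)`, assuming the real client matches the interface structurally.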
OAuth scopes needed:
- channels:read - List channels
- channels:history - Read messages
- groups:read - Private channels (optional)
- groups:history - Private messages (optional)
What gets synced:
- All messages in specified channels
- Threaded replies
- User mentions
- Files/images attached to messages
- Reactions (optional)
💡 Pro Tip: Combine Slack feeds with entity extraction to automatically identify who's working on which projects from Slack conversations.
Gmail
Use case: Search emails, extract contacts/companies, email-based RAG.
const gmailFeed = await graphlit.createFeed({
name: 'My Gmail',
type: FeedTypes.Email,
email: {
type: FeedServiceTypes.GoogleEmail,
google: {
refreshToken: process.env.GMAIL_OAUTH_TOKEN
},
includeAttachments: true,
readLimit: 100 // Last 100 emails
}
});
OAuth scopes needed:
https://www.googleapis.com/auth/gmail.readonly
What gets synced:
- Email subject, body, sender, recipients
- Attachments (PDFs, images, etc.)
- Timestamps
- Email threads
Microsoft Teams
const teamsFeed = await graphlit.createFeed({
name: 'Engineering Team',
type: FeedTypes.Message,
message: {
type: FeedServiceTypes.MicrosoftTeams,
microsoft: {
refreshToken: process.env.TEAMS_OAUTH_TOKEN
},
teamId: 'team-guid',
channel: 'channel-guid', // Single channel
readLimit: 100
}
});
Discord
const discordFeed = await graphlit.createFeed({
name: 'Community Discord',
type: FeedTypes.Message,
message: {
type: FeedServiceTypes.Discord,
discord: {
token: process.env.DISCORD_BOT_TOKEN,
serverId: 'guild-id'
},
channel: 'channel-id', // Single channel
readLimit: 500
}
});
Part 4: Cloud Storage Connectors
Google Drive
Use case: Sync company documents, collaborative files, shared folders.
const driveFeed = await graphlit.createFeed({
name: 'Company Drive',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.GoogleDrive,
googleDrive: {
refreshToken: process.env.GOOGLE_OAUTH_TOKEN,
folderId: 'folder-id' // Optional: sync specific folder
},
isRecursive: true,
readLimit: 1000
}
});
What gets synced:
- Google Docs (converted to markdown)
- Google Sheets (tables extracted)
- Google Slides (text extracted)
- PDFs, images, videos
- Files in subfolders
OAuth scopes needed:
https://www.googleapis.com/auth/drive.readonly
OneDrive / SharePoint
// OneDrive personal
const oneDriveFeed = await graphlit.createFeed({
name: 'My OneDrive',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.OneDrive,
oneDrive: {
refreshToken: process.env.MICROSOFT_OAUTH_TOKEN,
folderId: 'folder-id' // Optional
},
isRecursive: true,
readLimit: 500
}
});
// SharePoint (team sites)
const sharePointFeed = await graphlit.createFeed({
name: 'Company SharePoint',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.SharePoint,
sharePoint: {
refreshToken: process.env.MICROSOFT_OAUTH_TOKEN,
siteId: 'site-id',
driveId: 'drive-id'
},
isRecursive: true,
readLimit: 1000
}
});
GitHub
Use case: Sync code repos, documentation, READMEs.
const githubFeed = await graphlit.createFeed({
name: 'Company Repo',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.GitHub,
github: {
personalAccessToken: process.env.GITHUB_PAT,
repositoryOwner: 'my-company',
repositoryName: 'main-repo'
},
isRecursive: true
}
});
What gets synced:
- Source code files
- README.md files
- Documentation
- Commit messages (optional)
Amazon S3
const s3Feed = await graphlit.createFeed({
name: 'Documents S3 Bucket',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.S3Blob,
s3: {
bucketName: 'company-documents',
region: 'us-east-1',
accessKey: process.env.AWS_ACCESS_KEY,
secretAccessKey: process.env.AWS_SECRET_KEY,
prefix: 'public/' // Optional: sync specific folder
},
isRecursive: true
}
});
Part 5: Project Management Connectors
Jira
Use case: Search issues, track project status, entity extraction from tickets.
const jiraFeed = await graphlit.createFeed({
name: 'Engineering Jira',
type: FeedTypes.Issue,
issue: {
type: FeedServiceTypes.AtlassianJira,
jira: {
email: 'user@company.com',
token: process.env.JIRA_API_TOKEN,
uri: 'https://yourcompany.atlassian.net',
project: 'PROJ' // Project key
},
readLimit: 500
}
});
What gets synced:
- Issue title, description, comments
- Status, assignee, reporter
- Attachments
- Custom fields
Linear
const linearFeed = await graphlit.createFeed({
name: 'Product Linear',
type: FeedTypes.Issue,
issue: {
type: FeedServiceTypes.Linear,
linear: {
token: process.env.LINEAR_API_KEY,
teamId: 'team-id'
},
readLimit: 500
}
});
Notion
const notionFeed = await graphlit.createFeed({
name: 'Company Wiki',
type: FeedTypes.Notion,
notion: {
token: process.env.NOTION_INTEGRATION_TOKEN,
readLimit: 1000
}
});
What gets synced:
- Pages and sub-pages
- Databases and records
- Embedded content
- Inline comments
GitHub Issues & Pull Requests
// Issues
const issuesFeed = await graphlit.createFeed({
name: 'Repo Issues',
type: FeedTypes.Issue,
issue: {
type: FeedServiceTypes.GitHub,
github: {
personalAccessToken: process.env.GITHUB_PAT,
repositoryOwner: 'my-company',
repositoryName: 'main-repo',
includeIssues: true
},
readLimit: 500
}
});
// Pull Requests
const prFeed = await graphlit.createFeed({
name: 'Repo PRs',
type: FeedTypes.PullRequest,
pullRequest: {
type: FeedServiceTypes.GitHub,
github: {
personalAccessToken: process.env.GITHUB_PAT,
repositoryOwner: 'my-company',
repositoryName: 'main-repo'
},
readLimit: 100
}
});
Part 6: Social Media & Web Connectors
Reddit
const redditFeed = await graphlit.createFeed({
name: 'Tech Subreddit',
type: FeedTypes.Reddit,
reddit: {
subredditName: 'MachineLearning',
readLimit: 100
}
});
RSS Feeds
const rssFeed = await graphlit.createFeed({
name: 'Tech News RSS',
type: FeedTypes.Rss,
rss: {
uri: 'https://techcrunch.com/feed/',
readLimit: 50
}
});
Web Crawling
Use case: Scrape documentation sites, competitor analysis, content aggregation.
const webCrawl = await graphlit.createFeed({
name: 'Documentation Crawler',
type: FeedTypes.Web,
web: {
uri: 'https://docs.example.com',
allowedPaths: ['^https://docs\\.example\\.com/.*'], // Regex patterns
excludedPaths: ['/api/.*', '/archive/.*'],
readLimit: 500
}
});
What gets scraped:
- Page HTML (converted to markdown)
- Links (follows to crawl more pages)
- Images (optional)
- Metadata (title, description)
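To preview which URLs a pair of `allowedPaths` and `excludedPaths` patterns would admit, a local check like the one below can help before you create the feed. It is a sketch under stated assumptions: the guide does not say how the crawler combines the two lists, so this assumes a URL must match at least one allowed pattern (when any are given) and no excluded pattern, with each pattern tested as an unanchored regex against the full URL. `isCrawlable` is a hypothetical helper, not an SDK function.

```typescript
// Assumed semantics: allowed by at least one pattern, excluded by none.
function isCrawlable(url: string, allowed: string[], excluded: string[]): boolean {
  const matchesAny = (patterns: string[]): boolean =>
    patterns.some((pattern) => new RegExp(pattern).test(url));
  const allowedOk = allowed.length === 0 || matchesAny(allowed);
  return allowedOk && !matchesAny(excluded);
}

// Same patterns as the crawler example above.
const allowedPaths = ['^https://docs\\.example\\.com/.*'];
const excludedPaths = ['/api/.*', '/archive/.*'];

console.log(isCrawlable('https://docs.example.com/guides/feeds', allowedPaths, excludedPaths)); // true
console.log(isCrawlable('https://docs.example.com/api/v1/users', allowedPaths, excludedPaths)); // false
console.log(isCrawlable('https://blog.example.com/post', allowedPaths, excludedPaths)); // false
```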
YouTube
const youtubeFeed = await graphlit.createFeed({
name: 'Channel Videos',
type: FeedTypes.YouTube,
youtube: {
channelIdentifier: 'channel-id',
readLimit: 50
}
});
What gets synced:
- Video transcripts (auto-generated or manual)
- Titles, descriptions
- Thumbnails
- Comments (optional)
Part 7: Feed Management
Query Feeds
// Get all feeds
const feeds = await graphlit.queryFeeds();
feeds.feeds.results.forEach(feed => {
console.log(`${feed.name} (${feed.type})`);
console.log(` State: ${feed.state}`);
console.log(` Last sync: ${feed.lastSyncDateTime}`);
});
Update Feed
// Change feed configuration
await graphlit.updateFeed(feedId, {
name: 'Updated Name',
slack: {
readLimit: 1000 // Increase sync limit
}
});
Disable/Enable Feed
// Pause syncing
await graphlit.disableFeed(feedId);
// Resume syncing
await graphlit.enableFeed(feedId);
Delete Feed
// Delete feed (and optionally its content)
await graphlit.deleteFeed(feedId);
// Delete feed but keep synced content
await graphlit.deleteFeed(feedId, false);
Trigger Manual Sync
// Force immediate sync (useful for testing)
await graphlit.triggerFeedSync(feedId);
// Wait for sync to complete
let isDone = false;
while (!isDone) {
  const status = await graphlit.isFeedDone(feedId);
  isDone = status.isFeedDone.result;
  if (!isDone) {
    await new Promise(r => setTimeout(r, 5000)); // Check every 5s
  }
}
Part 8: Advanced Patterns
Pattern 1: Feed with Workflow
Apply processing to synced content:
// Create workflow first
const workflow = await graphlit.createWorkflow({
name: "Extract Entities",
extraction: { /* ... */ }
});
// Create feed with workflow
const feed = await graphlit.createFeed({
name: 'Slack with Entities',
type: FeedTypes.Slack,
slack: {
token: process.env.SLACK_BOT_TOKEN,
channel: 'general'
},
workflow: { id: workflow.createWorkflow.id }
});
// All synced messages will have entities extracted
Pattern 2: Feed with Collections
Auto-organize synced content:
// Create collection
const collection = await graphlit.createCollection('Slack Messages');
// Create feed that adds to collection
const feed = await graphlit.createFeed({
name: 'Slack Feed',
type: FeedTypes.Slack,
slack: {
token: process.env.SLACK_BOT_TOKEN,
channel: 'general'
},
collections: [{ id: collection.createCollection.id }]
});
Pattern 3: Multi-Feed Strategy
Sync from multiple sources into unified knowledge base:
// Feed 1: Slack
const slackFeed = await graphlit.createFeed({
name: 'Slack',
type: FeedTypes.Slack,
slack: {
token: process.env.SLACK_BOT_TOKEN,
channel: 'general'
}
});
// Feed 2: Gmail
const gmailFeed = await graphlit.createFeed({
name: 'Gmail',
type: FeedTypes.Email,
email: {
type: FeedServiceTypes.GoogleEmail,
google: {
refreshToken: process.env.GMAIL_OAUTH_TOKEN
},
readLimit: 100
}
});
// Feed 3: Google Drive
const driveFeed = await graphlit.createFeed({
name: 'Drive',
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.GoogleDrive,
googleDrive: {
refreshToken: process.env.GOOGLE_OAUTH_TOKEN
},
isRecursive: true
}
});
// Now search across all sources
const results = await graphlit.queryContents({
search: "project update"
});
// Returns results from Slack, Gmail, AND Drive
Pattern 4: Scheduled Feeds
Control sync frequency:
const feed = await graphlit.createFeed({
name: 'Daily News Feed',
type: FeedTypes.Rss,
rss: {
uri: 'https://news.com/feed',
readLimit: 50
},
schedulePolicy: {
recurrenceType: TimedPolicyRecurrenceTypes.Repeat,
repeatInterval: 'P1D' // ISO 8601: 1 day
}
});
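The `repeatInterval` above is an ISO 8601 duration, so `P1D` means one day and `PT15M` means fifteen minutes. If you want to sanity-check an interval before sending it, a small converter to milliseconds is enough. This sketch is a hypothetical helper, independent of the SDK, and it deliberately handles only weeks, days, hours, minutes, and seconds, because years and months have no fixed length.

```typescript
// Converts a subset of ISO 8601 durations (W, D, H, M, S) to milliseconds.
function isoDurationToMs(duration: string): number {
  const match = /^P(?:(\d+)W)?(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$/.exec(duration);
  // Reject non-matches and empty forms such as "P" or "PT".
  if (!match || duration === 'P' || duration.endsWith('T')) {
    throw new Error(`Unsupported ISO 8601 duration: ${duration}`);
  }
  const [weeks, days, hours, minutes, seconds] = match
    .slice(1)
    .map((value) => Number(value ?? 0));
  return ((((weeks * 7 + days) * 24 + hours) * 60 + minutes) * 60 + seconds) * 1000;
}

console.log(isoDurationToMs('P1D'));   // 86400000
console.log(isoDurationToMs('PT15M')); // 900000
```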
Part 9: Production Patterns
Pattern 1: OAuth Token Refresh
OAuth tokens expire—handle refresh:
// Store refresh token when user authorizes
const oauthData = {
accessToken: '...',
refreshToken: '...',
expiresAt: Date.now() + 3600000
};
// Before creating feed, check if token is expired
async function getValidToken() {
if (Date.now() > oauthData.expiresAt) {
// Refresh token
const newTokens = await refreshOAuthToken(oauthData.refreshToken);
oauthData.accessToken = newTokens.accessToken;
oauthData.expiresAt = Date.now() + 3600000;
}
return oauthData.accessToken;
}
// Use refreshed token
const token = await getValidToken();
const feed = await graphlit.createFeed({
name: 'Slack Feed',
type: FeedTypes.Slack,
slack: {
token,
channel: 'general'
}
});
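The refresh logic above hardcodes a one-hour lifetime and waits until the token has already expired. A slightly more defensive variant, sketched below, refreshes a little before expiry, uses the lifetime reported by the provider, and keeps the old refresh token when the provider does not rotate it. The `RefreshFn` type stands in for the undefined `refreshOAuthToken` call above; its response shape, and every name here, is an assumption for illustration.

```typescript
interface OAuthTokens {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical provider call: exchanges a refresh token for new tokens.
type RefreshFn = (refreshToken: string) => Promise<{
  accessToken: string;
  refreshToken?: string; // present only if the provider rotates refresh tokens
  expiresInSeconds: number;
}>;

// Returns the current tokens if still valid, otherwise refreshed ones.
async function getValidTokens(
  tokens: OAuthTokens,
  refresh: RefreshFn,
  now: number = Date.now(),
  safetyMarginMs: number = 60000
): Promise<OAuthTokens> {
  if (now < tokens.expiresAt - safetyMarginMs) return tokens;
  const fresh = await refresh(tokens.refreshToken);
  return {
    accessToken: fresh.accessToken,
    refreshToken: fresh.refreshToken ?? tokens.refreshToken,
    expiresAt: now + fresh.expiresInSeconds * 1000,
  };
}
```

Persist the returned object after each call so the next request starts from the latest tokens.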
Pattern 2: Feed Health Monitoring
Monitor feed status:
// Check all feeds
const feeds = await graphlit.queryFeeds();
feeds.feeds.results.forEach(feed => {
if (feed.state === 'FAILED') {
console.error(`Feed ${feed.name} failed`);
// Alert ops team
}
if (feed.lastSyncDateTime) {
const hoursSinceSync = (Date.now() - new Date(feed.lastSyncDateTime).getTime()) / 3600000;
if (hoursSinceSync > 24) {
console.warn(`Feed ${feed.name} hasn't synced in ${hoursSinceSync.toFixed(1)}h`);
}
}
});
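The staleness check above can be pulled into a pure function, which makes the threshold configurable and the logic easy to unit test. This is a sketch: `FeedSummary` mirrors only the two fields used above, `findStaleFeeds` is a hypothetical name, and unlike the loop above it also flags feeds that have never synced, which may be too noisy for feeds you created moments ago.

```typescript
// Only the fields the check needs; the SDK's feed type has more.
interface FeedSummary {
  name: string;
  lastSyncDateTime?: string | null;
}

// Returns feeds whose last sync is older than maxAgeHours, plus never-synced feeds.
function findStaleFeeds(
  feeds: FeedSummary[],
  maxAgeHours: number,
  now: Date = new Date()
): FeedSummary[] {
  return feeds.filter((feed) => {
    if (!feed.lastSyncDateTime) return true; // never synced counts as stale here
    const ageMs = now.getTime() - new Date(feed.lastSyncDateTime).getTime();
    return ageMs / 3600000 > maxAgeHours;
  });
}

const sampleNow = new Date('2024-01-02T12:00:00Z');
const staleFeeds = findStaleFeeds(
  [
    { name: 'Slack', lastSyncDateTime: '2024-01-02T10:00:00Z' }, // 2h old
    { name: 'Gmail', lastSyncDateTime: '2024-01-01T06:00:00Z' }, // 30h old
    { name: 'Drive', lastSyncDateTime: null },                   // never synced
  ],
  24,
  sampleNow
);
console.log(staleFeeds.map((feed) => feed.name)); // [ 'Gmail', 'Drive' ]
```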
Pattern 3: Rate Limiting
Avoid overwhelming external APIs:
// Create feeds with delays
const urls = ['url1', 'url2', 'url3'];
for (const url of urls) {
const feed = await graphlit.createFeed({
name: `RSS Feed ${url}`,
type: FeedTypes.Rss,
rss: { uri: url }
});
// Wait 5 seconds between feed creations
await new Promise(r => setTimeout(r, 5000));
}
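Spacing out requests reduces pressure on external APIs, but individual calls can still fail transiently, for example when a provider rate-limits you. One common complement is retrying with exponential backoff. The wrapper below is a generic sketch, not an SDK feature: `withRetry`, its option names, and the injectable `sleep` function are all illustrative. It retries on any error, so in real use you would likely restrict it to errors you know are transient.

```typescript
interface RetryOptions {
  maxAttempts?: number;                  // total attempts, including the first
  baseDelayMs?: number;                  // delay before the second attempt
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs the operation, doubling the delay after each failure.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const { maxAttempts = 4, baseDelayMs = 1000, sleep = defaultSleep } = options;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // 1s, 2s, 4s with defaults
    }
  }
  throw lastError;
}
```

A feed creation call could then be wrapped as `await withRetry(() => graphlit.createFeed(config))`, where `config` is your feed input.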
Common Issues & Solutions
Issue: OAuth Token Invalid
Problem: "Invalid token" error when creating feed.
Solution: Refresh OAuth token or re-authorize:
try {
const feed = await graphlit.createFeed(config);
} catch (error: any) {
if (String(error?.message ?? '').toLowerCase().includes('invalid token')) {
// Redirect user to re-authorize
window.location.href = getOAuthUrl();
}
}
Issue: Feed Not Syncing
Problem: Feed created but no content appears.
Solutions:
- Check feed state:
const feed = await graphlit.getFeed(feedId);
console.log('State:', feed.feed.state);
- Wait for initial sync:
await waitForFeedCompletion(feedId); // your helper that polls graphlit.isFeedDone, like the loop in the Slack example
- Trigger manual sync:
await graphlit.triggerFeedSync(feedId);
Issue: Too Much Content
Problem: Feed syncs thousands of items, overwhelming system.
Solution: Use readLimit:
const feed = await graphlit.createFeed({
name: 'Limited Slack Feed',
type: FeedTypes.Slack,
slack: {
token: process.env.SLACK_BOT_TOKEN,
channel: 'general',
readLimit: 100 // Only last 100 messages
}
});
What's Next?
You now have a working overview of how data connectors and feeds fit together. Next steps:
- Set up OAuth apps for connectors you need
- Create feeds for key data sources
- Apply workflows to customize processing
- Monitor feed health in production
Related guides:
- Content Ingestion - Manual ingestion vs feeds
- Workflows and Processing - Process feed content
- Building Knowledge Graphs - Extract entities from feeds
- Production Architecture - Monitor feed health
Happy connecting! 🔌