{"id":2483,"date":"2026-05-28T04:28:55","date_gmt":"2026-05-28T04:28:55","guid":{"rendered":"https:\/\/tucumandevelopers.com\/index.php\/2026\/05\/28\/i-built-an-mcp-server-that-gives-ai-persistent-memory-of-your-sql-database\/"},"modified":"2026-05-28T04:28:55","modified_gmt":"2026-05-28T04:28:55","slug":"i-built-an-mcp-server-that-gives-ai-persistent-memory-of-your-sql-database","status":"publish","type":"post","link":"https:\/\/tucumandevelopers.com\/index.php\/2026\/05\/28\/i-built-an-mcp-server-that-gives-ai-persistent-memory-of-your-sql-database\/","title":{"rendered":"I built an MCP server that gives AI persistent memory of your SQL database"},"content":{"rendered":"<div>\n<div><\/div>\n<div data-article-id=\"3768018\" id=\"article-body\">\n<p>A while ago I tried to build a local coding assistant. I downloaded Qwen3, fired it up on my MacBook with 16GB of RAM, and within a day realized the output quality was nowhere close to Claude or GPT-5. The model could <em>fit<\/em>. It just couldn&#8217;t <em>compete<\/em>.<\/p>\n<p>So I changed the question.<\/p>\n<p>If I can&#8217;t make the model smarter on my hardware, can I make what I feed it smarter?<\/p>\n<h2> <a name=\"where-the-tokens-actually-go\" href=\"#where-the-tokens-actually-go\"> <\/a> Where the tokens actually go <\/h2>\n<p>I started watching where my Claude \/ Cursor \/ Copilot sessions actually spent their tokens. The surprise: most of it wasn&#8217;t reasoning. It was <strong>lookup<\/strong>.<\/p>\n<p>Every fresh chat about my company&#8217;s database re-discovered the same things:<\/p>\n<ul>\n<li>What does <code>status = 3<\/code> mean? (cancelled)<\/li>\n<li>How does <code>orders<\/code> join to <code>users<\/code>? (<code>orders.user_id \u2192 users.id<\/code>)<\/li>\n<li>What&#8217;s that cryptic <code>JobStatus<\/code> enum? (a dozen integer codes nobody remembers)<\/li>\n<\/ul>\n<p>The model figured it out, the session ended, and tomorrow it figured it out <em>again<\/em>. Same tokens, same latency, every single time. The expensive part of working with an AI wasn&#8217;t the thinking \u2014 it was re-teaching it things it had already learned yesterday.<\/p>\n<p>There&#8217;s a lot of attention right now on trimming AI <strong>output<\/strong> tokens (talk like a caveman, strip the pleasantries, etc.). But in my workflow the bigger leak was on the <strong>input<\/strong> side: paying full token cost every session to re-establish context that never changed.<\/p>\n<h2> <a name=\"memory-isnt-a-feature-its-an-architecture-question\" href=\"#memory-isnt-a-feature-its-an-architecture-question\"> <\/a> &#8220;Memory&#8221; isn&#8217;t a feature, it&#8217;s an architecture question <\/h2>\n<p>AI clients are starting to bolt on &#8220;memory&#8221; features. But they&#8217;re proprietary, opaque, and locked to one tool. Claude&#8217;s memory doesn&#8217;t help Cursor. Cursor&#8217;s doesn&#8217;t help Copilot. You can&#8217;t inspect it, you can&#8217;t share it with a teammate, and you can&#8217;t diff it.<\/p>\n<p>What I actually wanted was an <strong>explicit, inspectable, shareable context layer<\/strong> that <em>any<\/em> AI client could read deterministically \u2014 same answer every time, same file my team could hand off.<\/p>\n<p>I picked the highest re-learn cost in my world to start with: <strong>SQL databases.<\/strong><\/p>\n<h2> <a name=\"enter-amnesic\" href=\"#enter-amnesic\"> <\/a> Enter amnesic <\/h2>\n<p><a href=\"https:\/\/github.com\/SurajKGoyal\/amnesic\" target=\"_blank\" rel=\"noopener noreferrer\">amnesic<\/a> is an open-source MCP server that gives any AI client persistent semantic memory of your SQL databases. The name is ironic \u2014 it&#8217;s anything but amnesic. It remembers.<\/p>\n<p>You (or the AI) annotate a table or column once: <\/p>\n<div>\n<pre><code><span>db_annotate<\/span><span>(<\/span> <span>table<\/span><span>=<\/span><span>\"<\/span><span>orders<\/span><span>\"<\/span><span>,<\/span> <span>column<\/span><span>=<\/span><span>\"<\/span><span>status<\/span><span>\"<\/span><span>,<\/span> <span>column_description<\/span><span>=<\/span><span>\"<\/span><span>Order lifecycle state<\/span><span>\"<\/span><span>,<\/span> <span>enum_values<\/span><span>=<\/span><span>{<\/span><span>\"<\/span><span>1<\/span><span>\"<\/span><span>:<\/span> <span>\"<\/span><span>pending<\/span><span>\"<\/span><span>,<\/span> <span>\"<\/span><span>2<\/span><span>\"<\/span><span>:<\/span> <span>\"<\/span><span>shipped<\/span><span>\"<\/span><span>,<\/span> <span>\"<\/span><span>3<\/span><span>\"<\/span><span>:<\/span> <span>\"<\/span><span>cancelled<\/span><span>\"<\/span><span>,<\/span> <span>\"<\/span><span>4<\/span><span>\"<\/span><span>:<\/span> <span>\"<\/span><span>delivered<\/span><span>\"<\/span><span>},<\/span> <span>)<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>\u2026and it&#8217;s stored in a local SQLite file. Every future <code>db_get_schema<\/code> call merges those annotations back into the response \u2014 across sessions, across AI clients, forever: <\/p>\n<div>\n<pre><code><span>You<\/span><span>:<\/span> <span>How<\/span> <span>many<\/span> <span>cancelled<\/span> <span>orders<\/span> <span>this<\/span> <span>month<\/span><span>?<\/span> <span>AI<\/span><span>:<\/span> <span>[<\/span><span>calls<\/span> <span>db_get_schema<\/span><span>(<\/span><span>\"orders\"<\/span><span>)]<\/span> <span>\u2192<\/span> <span>status<\/span> <span>column<\/span><span>:<\/span> <span>enum<\/span> <span>{<\/span><span>\"3\"<\/span><span>:<\/span> <span>\"cancelled\"<\/span><span>,<\/span> <span>...}<\/span> <span>[<\/span><span>writes<\/span> <span>correct<\/span> <span>SQL<\/span> <span>immediately<\/span><span>,<\/span> <span>no<\/span> <span>re<\/span><span>-<\/span><span>discovery<\/span><span>]<\/span> <span>SELECT<\/span> <span>COUNT<\/span><span>(<\/span><span>*<\/span><span>)<\/span> <span>FROM<\/span> <span>orders<\/span> <span>WHERE<\/span> <span>status<\/span> <span>=<\/span> <span>3<\/span> <span>AND<\/span> <span>...<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>No re-explaining. No wasted turns. The annotation persisted.<\/p>\n<h2> <a name=\"the-technical-decisions-id-defend\" href=\"#the-technical-decisions-id-defend\"> <\/a> The technical decisions I&#8217;d defend <\/h2>\n<p>A few choices that might interest people building similar tools:<\/p>\n<h3> <a name=\"sqlite-fts5-over-a-vector-db\" href=\"#sqlite-fts5-over-a-vector-db\"> <\/a> SQLite FTS5 over a vector DB <\/h3>\n<p>I started with ChromaDB for search \u2014 &#8220;find the table that handles payments.&#8221; Then I ripped it out. SQLite&#8217;s built-in <strong>FTS5<\/strong> with BM25 ranking covers the &#8220;find the right table\/column&#8221; use case at <strong>zero dependency cost<\/strong>. No embeddings, no model download, no external service. For a tool that&#8217;s supposed to be a lightweight local layer, pulling in a 50MB+ vector stack was the wrong trade. <\/p>\n<div>\n<pre><code><span>db_search<\/span><span>(<\/span><span>\"<\/span><span>payment<\/span><span>\"<\/span><span>)<\/span> <span># \u2192 ranked: orders.payment_method, consumerpayments table, ... # all from a local FTS5 index, no network, no embeddings <\/span><\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h3> <a name=\"twolayer-readonly-enforcement\" href=\"#twolayer-readonly-enforcement\"> <\/a> Two-layer read-only enforcement <\/h3>\n<p>amnesic connects to production databases, so the AI must <strong>never<\/strong> be able to mutate anything. Two independent layers:<\/p>\n<ol>\n<li> <strong>Static SQL analysis<\/strong> \u2014 reject anything that isn&#8217;t <code>SELECT<\/code> \/ <code>WITH<\/code>; catch write keywords, <code>SELECT ... INTO OUTFILE<\/code>, and writes smuggled inside CTEs.<\/li>\n<li> <strong>Transaction rollback<\/strong> \u2014 every query runs inside a transaction that&#8217;s immediately rolled back. Even if a write slipped past layer 1, nothing commits.<\/li>\n<\/ol>\n<p>Belt and suspenders. The AI shouldn&#8217;t be able to drop your table even by accident.<\/p>\n<h3> <a name=\"one-sqlite-file-per-connection\" href=\"#one-sqlite-file-per-connection\"> <\/a> One SQLite file per connection <\/h3>\n<p>Schema cache + annotations + FK relationship graph + the FTS5 index all live in one SQLite file per database connection. Portable, inspectable, <code>chmod 600<\/code>. Want to hand your accumulated knowledge to a teammate? It&#8217;s a single file.<\/p>\n<h3> <a name=\"data-minimization-as-a-side-effect\" href=\"#data-minimization-as-a-side-effect\"> <\/a> Data minimization as a side effect <\/h3>\n<p>A nice property fell out of the design: a well-annotated schema means the AI answers most questions from the <strong>local knowledge file<\/strong> \u2014 without ever querying the database. <em>&#8220;What does status=3 mean?&#8221;<\/em> resolves from the annotation. <em>&#8220;How do orders join users?&#8221;<\/em> resolves from the FK graph. That&#8217;s measurably less row data leaving your machine than a &#8220;naked&#8221; SQL MCP that runs <code>SELECT DISTINCT status FROM orders<\/code> every time it&#8217;s confused.<\/p>\n<h2> <a name=\"what-its-not\" href=\"#what-its-not\"> <\/a> What it&#8217;s <em>not<\/em> <\/h2>\n<ul>\n<li>It doesn&#8217;t make the model smarter.<\/li>\n<li>It doesn&#8217;t do natural-language-to-SQL.<\/li>\n<li>It&#8217;s not a replacement for execution-focused MCP servers \u2014 those handle query execution and live introspection well. amnesic&#8217;s only job is the <strong>persistence\/annotation layer<\/strong> I couldn&#8217;t find in any of them.<\/li>\n<\/ul>\n<h2> <a name=\"try-it\" href=\"#try-it\"> <\/a> Try it <\/h2>\n<div>\n<pre><code>pip <span>install <\/span>amnesic amnesic init <span># interactive setup wizard<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>Then add it to your AI client&#8217;s <code>mcp.json<\/code> and restart. Works with <strong>PostgreSQL, MySQL, MSSQL, and SQLite<\/strong>. MIT-licensed, on PyPI, and registered on the official Linux Foundation MCP Registry.<\/p>\n<p>GitHub: <a href=\"https:\/\/github.com\/SurajKGoyal\/amnesic\" target=\"_blank\" rel=\"noopener noreferrer\">github.com\/SurajKGoyal\/amnesic<\/a><\/p>\n<h2> <a name=\"the-takeaway\" href=\"#the-takeaway\"> <\/a> The takeaway <\/h2>\n<p>Not every AI problem needs a smarter model. Sometimes the win is an external context layer that&#8217;s <strong>deterministic, inspectable, and shared<\/strong> \u2014 so the model never has to learn the same thing twice.<\/p>\n<p>I&#8217;d love feedback, especially on the read-only enforcement \u2014 that&#8217;s the part that has to be bulletproof. Issues and PRs welcome.<\/p>\n<\/p><\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>Fuente: <a href=\"https:\/\/dev.to\/surajkgoyal\/i-built-an-mcp-server-that-gives-ai-persistent-memory-of-your-sql-database-3d2a\">Art\u00edculo original<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A while ago I tried to build a local coding assistant. I downloaded Qwen3, fired it up on my MacBook with 16GB of RAM, and within a day realized the output quality was nowhere close to Claude or GPT-5. The model could fit. It just couldn&#8217;t compete. So I changed the question. If I can&#8217;t [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2482,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[41],"tags":[],"class_list":["post-2483","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devto"],"jetpack_publicize_connections":[],"_links":{"self":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts\/2483","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/comments?post=2483"}],"version-history":[{"count":0,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts\/2483\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/media\/2482"}],"wp:attachment":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/media?parent=2483"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/categories?post=2483"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/tags?post=2483"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}