<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Mihai Criveti on Medium]]></title>
        <description><![CDATA[Stories by Mihai Criveti on Medium]]></description>
        <link>https://medium.com/@crivetimihai?source=rss-7648462e917d------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/0*OSaUfTCLmi6Wfd1B</url>
            <title>Stories by Mihai Criveti on Medium</title>
            <link>https://medium.com/@crivetimihai?source=rss-7648462e917d------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Mon, 22 Jun 2026 01:01:47 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@crivetimihai/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[MCP Gateway: The Missing Proxy for AI Tools]]></title>
            <link>https://medium.com/@crivetimihai/mcp-gateway-the-missing-proxy-for-ai-tools-2b16d3b018d5?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/2b16d3b018d5</guid>
            <category><![CDATA[co-pilot]]></category>
            <category><![CDATA[mcp-client]]></category>
            <category><![CDATA[ai-agent]]></category>
            <category><![CDATA[mcp-protocol]]></category>
            <category><![CDATA[mcp-server]]></category>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Sat, 07 Jun 2025 16:52:24 GMT</pubDate>
            <atom:updated>2025-08-06T20:42:20.620Z</atom:updated>
            <content:encoded><![CDATA[<h3><strong>ContextForge MCP Gateway: The Missing Proxy &amp; Registry for AI Tools</strong></h3><p>AI agents and tool integration are exciting, until you actually try to connect them. Different authentication systems (or none), fragmented documentation, and incompatible protocols quickly turn what should be simple integrations into debugging nightmares.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*HIpK3U3PaBEZi-NyIBv_6A.png" /><figcaption>MCP Gateway is a smart proxy that sits between your MCP Clients and MCP Servers</figcaption></figure><p>I’ve released <a href="https://github.com/IBM/mcp-context-forge">ContextForge MCP Gateway</a> as open source to solve this problem. It sits between your AI clients and tool servers, giving you a clean, secure endpoint for everything, and it supports both REST and MCP as upstream and downstream protocols (including stdio, SSE and Streamable HTTP, with auth). If you find it useful, <a href="https://github.com/IBM/mcp-context-forge"><strong>leave us a star on GitHub</strong></a><strong> </strong>⭐!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PKje5x04p2covp4tK5UelQ.gif" /><figcaption>ContextForge MCP Gateway: Adding an MCP Server and Creating a Virtual Gateway</figcaption></figure><p>In this article, I’ll share how to deploy MCP Gateway along with two MCP Servers (time and git), federate them in the gateway, and connect to them using an MCP Client (GitHub Copilot in Visual Studio Code).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*rydclhVrYam79m0zLIl5ug.png" /></figure><h3>The MCP Integration Problem</h3><p>The <a href="https://modelcontextprotocol.io/">Model Context Protocol</a> promises to standardize how AI models call external tools, but the reality is messy. 
The ecosystem now has over 15,000 MCP-compatible servers, but they’re anything but uniform:</p><p><strong>Transport chaos and incomplete implementations</strong>: Some servers only support <a href="https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#stdio">STDIO</a>, others stream over <a href="https://modelcontextprotocol.io/docs/concepts/transports#server-sent-events-sse">Server-Sent Events</a>, and a few expose <a href="https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#streamable-http">Streamable HTTP</a> endpoints. They can’t talk to each other without custom adapters. And while the MCP specification keeps evolving (it now deprecates SSE), MCP clients and servers are slow to catch up.</p><p><strong>Security gaps</strong>: Many test servers skip authentication entirely or use weak schemes. Depending on what the client supports, you’ll want to use either a stdio wrapper, SSE with Bearer JWT auth, or full-blown OAuth.</p><p><strong>Writing new MCP Servers</strong>: Your existing API endpoints aren’t yet available as MCP Servers. You have to write new servers and test them.</p><p><strong>Everything lives somewhere different</strong>: Your prompt library runs on Server A, the vector database on Server B, and your custom tools on Server C. Managing URLs, keys, and retry logic across all of them becomes a full-time job.</p><p><a href="https://www.slideshare.net/slideshow/contextforge-mcp-gateway-the-missing-proxy-for-ai-agents-and-tools/280961197">https://www.slideshare.net/slideshow/contextforge-mcp-gateway-the-missing-proxy-for-ai-agents-and-tools/280961197</a></p><h3>How MCP Gateway Fixes This</h3><p>Instead of handling retry logic and managing multiple MCP servers directly in your agent or client application, MCP Gateway centralizes all that complexity. 
You can create multiple virtual servers for different clients or use cases, each with its own tool configurations and access controls.</p><p>The <a href="https://ibm.github.io/mcp-context-forge/using/mcpgateway-wrapper/">MCP Gateway Wrapper</a> allows you to connect to the gateway securely, using a JWT token, while exposing a local STDIO server.</p><p>The gateway acts as a smart proxy that normalizes everything behind a single endpoint:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*IEBA7zdnU96QnoZthLtBFg.png" /></figure><h3>One Consistent Interface</h3><p>The gateway converts STDIO, SSE, and HTTP into consistent HTTPS+JSON-RPC, so your clients only need to know one protocol.</p><p>Your MCP Clients, whether agent frameworks such as LangChain, AutoGen, and CrewAI or Visual Studio Code extensions such as GitHub Copilot, can connect over SSE with JWT auth, or locally via STDIO (using mcpgateway-wrapper to connect securely to the Gateway).</p><h3>Complete Tool Discovery and Debugging</h3><p>The gateway automatically discovers all connected servers and presents every tool, prompt, and resource in one catalog.</p><p>Tools can be enabled, disabled, and tested, and each tool’s JSON Schema and description can be viewed easily.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*byfjvc5cs9I7bU4VdfMJ5Q.png" /></figure><h3>Non-MCP API Support</h3><p>Wrap any REST API endpoint and expose it as a fully-typed MCP tool with automatic retries and schema validation.</p><h3>Built-in Observability</h3><p>Every API call is timed and logged, so you can track performance and debug issues without external monitoring tools. 
Per-tool, per-server, and aggregated metrics are available.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*5BqlQb5Mk5X0SdkC-qUYSw.png" /></figure><h3>Getting Started</h3><p>The gateway ships as a single Docker container or pip package with no external dependencies, just a local SQLite database.</p><h3>Docker / Podman</h3><pre>docker run -d --name mcpgateway \<br>  -p 4444:4444 \<br>  -e MCPGATEWAY_UI_ENABLED=true \<br>  -e MCPGATEWAY_ADMIN_API_ENABLED=true \<br>  -e HOST=0.0.0.0 \<br>  -e JWT_SECRET_KEY=my-test-key \<br>  -e BASIC_AUTH_USER=admin \<br>  -e BASIC_AUTH_PASSWORD=changeme \<br>  -e AUTH_REQUIRED=true \<br>  -e DATABASE_URL=sqlite:///./mcp.db \<br>  ghcr.io/ibm/mcp-context-forge:0.5.0</pre><h3>Python Package</h3><pre>pip install mcp-contextforge-gateway<br><br># Enable the visual Admin UI (true/false)<br>export MCPGATEWAY_UI_ENABLED=true<br><br># Enable the Admin API endpoints (true/false)<br>export MCPGATEWAY_ADMIN_API_ENABLED=true<br><br>BASIC_AUTH_PASSWORD=password mcpgateway --host 127.0.0.1 --port 4444</pre><p>Once running, you’ll have access to <a href="http://localhost:4444">http://localhost:4444</a> with:</p><ul><li><strong>Admin Dashboard</strong> (/admin) - Manage servers and tools through a web interface.</li><li><strong>API Documentation</strong> (/docs and /redoc) - Interactive Swagger documentation.</li><li><strong>Version, Health and Configuration</strong> (/version and /health) - Get version, configuration and debugging information. 
Requires auth (log in to the admin page first, or call it as an API to retrieve JSON).</li><li><strong>JSON-RPC Endpoint</strong> (/rpc) - RPC Endpoint.</li><li><strong>Metrics</strong> (/metrics) - Performance and usage statistics.</li></ul><p>Detailed deployment across Containers, Docker, Compose, Kubernetes, OpenShift, Minikube, Helm, Code Engine, AWS and Azure is available through the <a href="https://ibm.github.io/mcp-context-forge/deployment/">project documentation page</a>.</p><h3>Integration with VS Code and GitHub Copilot</h3><p>One of the most practical applications is connecting MCP Gateway to GitHub Copilot in VS Code. This gives Copilot access to all your tools through a single, secure connection.</p><p>Generate a JWT token and test it:</p><pre># Generate a token and keep it in a variable for the curl call below<br>export MCPGATEWAY_BEARER_TOKEN=$(python -m mcpgateway.utils.create_jwt_token \<br>  --username admin --exp 0 --secret my-test-key)<br><br>curl -s -H &quot;Authorization: Bearer $MCPGATEWAY_BEARER_TOKEN&quot; \<br>     http://localhost:4444/version | jq</pre><p>Spin up a couple of MCP Servers:</p><pre># Install npx (requires Node.js / npm)<br>npm install -g npx<br><br># Install uvenv<br>pip install uvenv<br><br># Deploy mcp-server-git (default port 8000)<br>npx -y supergateway --stdio &quot;uvenv run mcp-server-git&quot;<br><br># Deploy mcp_server_time with a local timezone (port 8001)<br>npx -y supergateway \<br>  --stdio &quot;uvenv run mcp_server_time -- --local-timezone=Europe/Dublin&quot; \<br>  --port 8001</pre><p>Add the MCP Servers to your gateway, under “Gateways (MCP Registry)”.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*MnMcfRMQVrhbDudkGB9Yag.png" /></figure><p>Now, create a Virtual Server under the Servers Catalog tab, adding just the tools you want to share with your MCP Clients.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1FV2R2oUifdM8WUuAqN3PQ.png" /></figure><p>Then, enable MCP support in VS Code by adding &quot;chat.mcp.enabled&quot;: true to your GitHub Copilot Chat settings.json. 
Then add an MCP configuration block as described in the VS Code <a href="https://code.visualstudio.com/docs/copilot/chat/mcp-servers">documentation</a>:</p><pre>{<br>  &quot;servers&quot;: {<br>    &quot;gateway&quot;: {<br>      &quot;type&quot;: &quot;sse&quot;,<br>      &quot;url&quot;: &quot;http://localhost:4444/servers/1/sse&quot;,<br>      &quot;headers&quot;: {<br>        &quot;Authorization&quot;: &quot;Bearer YOUR_JWT_TOKEN&quot;<br>      }<br>    }<br>  }<br>}</pre><p>Press Ctrl + Alt + I to open Copilot Chat, and click on Tools. You’ll see all your gateway-managed tools available in the Tools panel:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*AEu9Vm7ttSGNhDq8Pc7V9w.png" /></figure><h3>Running the STDIO Wrapper (mcpgateway.wrapper)</h3><p>Some AI agents (such as Claude Desktop, LangChain CLI, or custom shell-based tools) can’t authenticate over SSE. To support these securely, MCP Gateway offers a wrapper module that connects to the Gateway using <strong>JWT-authenticated HTTP</strong> and exposes a local <strong>STDIO</strong> interface. 
See <a href="https://ibm.github.io/mcp-context-forge/using/mcpgateway-wrapper/">mcpgateway.wrapper documentation</a> for more info.</p><p>For example, Claude Desktop and similar tools can define this wrapper in their config:</p><pre>{<br>  &quot;mcpServers&quot;: {<br>    &quot;mcpgateway-wrapper&quot;: {<br>      &quot;command&quot;: &quot;python3&quot;,<br>      &quot;args&quot;: [&quot;-m&quot;, &quot;mcpgateway.wrapper&quot;],<br>      &quot;env&quot;: {<br>        &quot;MCP_AUTH_TOKEN&quot;: &quot;&lt;your-token&gt;&quot;,<br>        &quot;MCP_SERVER_CATALOG_URLS&quot;: &quot;http://localhost:4444/servers/1&quot;<br>      }<br>    }<br>  }<br>}</pre><p>You can now use local AI agents or dev tools that speak STDIO to access all tools exposed via your gateway’s virtual servers, all authenticated, observable, and fully compatible.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9lF7iJTKOps2ZWIiRKhr3w.png" /></figure><h3>Troubleshooting</h3><p>You can use MCP Inspector to access the Global Tools catalog at <a href="http://localhost:4444/tools">http://localhost:4444/tools</a>, and then each of the individual virtual servers you’ve created, for example <a href="http://localhost:4444/servers/1/sse">http://localhost:4444/servers/1/sse</a>.</p><pre>npx @modelcontextprotocol/inspector</pre><h3>Real-World Impact</h3><p>Instead of maintaining separate connections to a dozen different tool servers, each with its own authentication, error handling, and monitoring, you manage one gateway instance. When you need to add a new tool, you register it with the gateway rather than updating every client.</p><p>This architectural change becomes more valuable as your tool ecosystem grows. 
Whether you’re building a personal AI assistant or deploying enterprise-scale automation, MCP Gateway eliminates the integration overhead that typically consumes more time than the actual feature work.</p><p>The gateway is production-ready with Kubernetes support, Helm charts, and comprehensive monitoring. You can start with the simple Docker setup and scale up as needed.</p><h3>Try It Out</h3><ul><li><strong>Source code</strong>: <a href="https://github.com/IBM/mcp-context-forge">GitHub repository</a></li><li><strong>Documentation</strong>: <a href="https://ibm.github.io/mcp-context-forge/">Project docs</a></li><li><strong>Package</strong>: <a href="https://pypi.org/project/mcp-contextforge-gateway/">PyPI listing</a></li></ul><p>The Model Context Protocol represents an important step toward standardized AI tool integration, but the implementation reality is still fragmented, and both the protocol and the MCP ecosystem will keep evolving. MCP Gateway helps bridge that gap, turning different MCP servers into a unified, secure, and observable system.</p><p>Give it a try: you’ll spend less time on plumbing and more time building the AI experiences that matter.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Reducing AI Hallucinations with RAG: Writing an entire article from a Podcast Episode]]></title>
            <link>https://medium.com/@crivetimihai/reducing-ai-hallucinations-with-rag-writing-an-entire-article-from-a-podcast-episode-40296867eb8a?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/40296867eb8a</guid>
            <category><![CDATA[rags]]></category>
            <category><![CDATA[huggin-face]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[llama-2]]></category>
            <category><![CDATA[vector-database]]></category>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Sun, 15 Oct 2023 00:06:08 GMT</pubDate>
            <atom:updated>2023-10-15T00:39:39.184Z</atom:updated>
            <content:encoded><![CDATA[<h3>Reducing AI Hallucinations with RAG: Automatic Podcast to blog</h3><p><em>Generating a medium.com article automatically from a voice recording of Mihai Criveti’s appearance on IBM Fellow Jerry Cuomo’s Art of AI Podcast.</em></p><p>I’m hosting a hands-on workshop on <a href="https://www.linkedin.com/feed/update/urn:li:activity:7117408009814728704/">Practical GenAI</a> with HuggingFace 🤗 and Python models soon, and decided to challenge myself a bit by generating this entire post using IBM’s <a href="https://www.ibm.com/products/watsonx-ai">watsonx.ai</a>, LLAMA2 and a Retrieval Augmented Generation platform I’m building. This content is based only on a 15-minute podcast where I discuss Large Language Models and Hallucinations with IBM Fellow and VP of Technology, Jerry Cuomo.</p><p>I’ve already hand-written an article on <a href="https://medium.com/@crivetimihai/understand-genai-large-language-model-limitations-and-how-retrieval-augmented-generation-can-help-d019e394eeb7">Reducing LLM Hallucinations with RAG</a> that the podcast was based on, which is not in the training dataset of the models used. This post takes those techniques one step further with Reciprocal Rank Fusion, Hybrid Searching, and more.</p><p>The content below is generated by watsonx using LLAMA2, including all the Questions and Answers, which have also been auto-generated. 
The source content is “<a href="https://listen.casted.us/public/95/The-Art-of-AI-7315085b/0a99a126/share/f79946ba">Art of AI Episode 4</a>: Hallucinations, with guest <a href="https://www.linkedin.com/in/crivetimihai/">Mihai Criveti</a>”, a 15-minute podcast where I discuss Large Language Models and Hallucinations with IBM Fellow and VP of Technology, <a href="https://www.linkedin.com/in/jerry-cuomo/">Jerry Cuomo</a>. The audio was converted to text using a speech recognition model.</p><p>The result was formatted automatically, with only a few links added by hand: to Jerry’s podcast, the hand-written article, and our LinkedIn profiles. With further metadata enhancement, or by connecting to an internet search platform for further RAG context infusion, links could also be generated automatically.</p><p>Subscribe to my YouTube channel, <a href="https://www.youtube.com/channel/UCznS5usUIQ-5W-yksOhdbng">Practical Cloud and AI with Mihai</a>, for more videos and discussions on AI, RAG, reducing Hallucinations and more. I’ll do a deep dive there shortly!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dQeOG1NLgcYsO5noqOkohA.png" /></figure><h3>Executive Summary</h3><p>In “The Art of AI,” Episode 4, hosts Jerry Cuomo and Mihai Criveti discuss the challenges posed by hallucinations in Large Language Models (LLMs). As a solution, Criveti suggests methods such as few-shot prompting and retrieval augmented generation (RAG), which involve providing the model with relevant context and information to improve the accuracy of its responses.</p><p>Additionally, they emphasize the importance of attribution and transparency in AI decision-making processes.</p><p>Finally, they mention related resources, such as Criveti’s paper on Understanding Generative AI, and invite listeners to continue exploring these topics further. 
Key takeaways include the potential benefits of RAG and the significance of developing explainable AI systems.</p><h3>Participant Names</h3><p><em>The key participants in this episode of The Art of AI are:</em></p><ul><li><a href="https://www.linkedin.com/in/jerry-cuomo/"><strong>Jerry Cuomo</strong></a><strong> </strong>- host, IBM Fellow, and VP of Technology</li><li><a href="https://www.linkedin.com/in/crivetimihai/">Mihai Criveti</a> - guest, STSM &amp; Principal Architect, OIC Vice Chair for Technology, ScribeFlow GenAI Lead, and Podcast Host.</li></ul><h3>Action Items</h3><p><em>Here are some potential actions or tasks for listeners based on the contents of the podcast:</em></p><ul><li>Read <a href="https://www.youtube.com/channel/UCznS5usUIQ-5W-yksOhdbng">Mihai Criveti</a>’s article on Medium, “<a href="https://medium.com/@crivetimihai/understand-genai-large-language-model-limitations-and-how-retrieval-augmented-generation-can-help-d019e394eeb7">Understanding GenAI Large Language Model Limitations, and How Retrieval Augmented Generation Can Help</a>,” which is mentioned in the podcast.</li><li>Research and learn more about Large Language Models (LLMs), their limitations, and how retrieval-augmented generation can help address some of these limitations.</li><li>Consider incorporating the approach of providing contextual information and relevant databases or knowledge bases when interacting with LLMs to improve the quality and accuracy of their responses.</li><li>Explore the idea of attributable AI and understand how it works.</li><li>Look into the possibility of Fine-Tuning LLMs and compare the benefits and drawbacks of doing so versus other methods of providing context.</li><li>Investigate the cost and environmental impact of training and utilizing LLMs and consider whether there are ways to mitigate these factors.</li></ul><h3>Relevant Questions and Answers</h3><p><em>All the questions (and answers) below have been generated by AI; no manual prompting was 
used.</em></p><h3>How can you get more information</h3><p>Mihai Criveti’s article “Understanding GenAI Large Language Model Limitations, and How Retrieval Augmented Generation Can Help” provides insight into utilizing large language models (LLMs) effectively and responsibly.</p><p><strong>He highlights three essential strategies:</strong></p><ol><li>prompt engineering, where instructions are given to the model along with relevant context;</li><li>retrieval augmented generation, involving the integration of a knowledge base or database search into the model’s prompt;</li><li>and vector databases, which enable flexible searches based on similarity rather than exact keywords.</li></ol><p>These methods aim to minimize hallucinations (the model’s inclination to create responses without proper context or basis in reality) and enhance the accuracy of the produced outputs by incorporating context.</p><p>Additionally, they offer advantages like cost efficiency and increased speed compared to entirely retraining the model or fine-tuning it. By implementing these techniques, it is feasible to improve the performance of LLMs and move toward Explainable AI.</p><h3>What is a large language model</h3><p>A large language model (LLM) is a type of artificial intelligence (AI) model that processes natural language text. It is trained on vast quantities of text data and uses machine learning algorithms to learn patterns and relationships within the language. These models can generate human-like text, complete sentences, summarize information, and answer questions.</p><p>However, despite their capabilities, LLMs have limitations. They can produce erroneous or misleading outputs, known as hallucinations, due to their lack of reasoning abilities and limited context awareness. 
Additionally, they have constraints regarding the amount of input data they can handle, leading to reduced quality responses beyond a certain threshold.</p><p>To address these limitations, techniques such as prompt engineering, retrieval-augmented generation, and vector databases can enhance the accuracy and relevance of LLM responses.</p><p>Prompt engineering involves providing specific instructions to guide the model’s output, while retrieval-augmented generation incorporates external knowledge sources to expand the model’s context.</p><p>Vector databases utilize vector space representations of words to identify related documents and improve the efficiency of searches. By employing these strategies, organizations can leverage LLMs more effectively and responsibly in various applications, including customer service, writing assistance, and language translation.</p><h3>What are some of its limitations</h3><p>Some limitations of large language models (LLMs) include:</p><p><strong>1. Models only act on language:</strong> They cannot handle PDF files, Word documents, binary files, images, or anything other than text.</p><p><strong>2. Limited domain expertise</strong>: Although LLMs can generate text similar to human language, they lack domain-specific knowledge; therefore, their responses may need to be corrected regarding specialized domains such as medicine, law, etc.</p><p><strong>3. No reasoning or deduction abilities:</strong> LLMs lack reasoning skills and cannot perform mathematical calculations or operations. They rely solely on statistical patterns identified during training.</p><p><strong>4. No direct access to external sources: </strong>LLMs do not directly interact with the Internet or utilize outside data beyond what was included in their initial training dataset. As a result, they are oblivious to current events or fresh facts not incorporated into their instructional material.</p><p><strong>5. 
Hallucination generations:</strong> Due to the absence of context, the model may produce erroneous outputs known as hallucinations. These hallucinations can range from grammatical mistakes to illogical statements or blatantly wrong assertions.</p><p><strong>6. Factual inaccuracies: </strong>Sometimes, LLMs offer replies that appear believable but contain inaccurate details. They draw upon the information given during training, which may include inaccuracies present in the source materials. Therefore, factual accuracy cannot be guaranteed.</p><p><strong>7. Contextual misunderstandings: </strong>Without proper guidance, LLMs risk producing responses disconnected from the situation due to their restricted capacity for comprehending context. Clear directions must be provided through prompt engineering to address this limitation.</p><p><strong>8. Memory restrictions:</strong> Typically, LLMs can accept around 4,096 tokens or 3,000 words as input before performance declines. Consequently, offering extensive context becomes impractical without careful selection of essential information.</p><p>In summary, despite their impressive capabilities, LLMs face various constraints. To enhance their effectiveness, strategies like prompt engineering, retrieval-augmented generation, and thoughtful context injection become vital.</p><h3>What can cause them to produce hallucinations</h3><p><em>What causes hallucinations in Large Language Models (LLMs)?</em></p><p>Hallucinations in LLMs are caused by several factors, including:</p><p><strong>Lack of Context: </strong>Sometimes, the model lacks context, leading to misunderstandings and inaccurate responses.</p><p><strong>Limited Training Data: </strong>LLMs are trained on vast amounts of data; however, the quality and accuracy of this data can vary greatly. 
As a result, biases and inconsistencies can arise in the training data, causing hallucinations.</p><p><strong>Prediction-Based Model: </strong>LLMs rely on predictions rather than actual learning, resulting in inaccuracies and hallucinations.</p><p><strong>Conversational Memory Limits: </strong>There are restrictions to the conversational memory of LLMs since they can only store a limited amount of information from earlier interactions. Therefore, their ability to remember crucial details may be restricted, producing hallucinations.</p><p><strong>Poor Quality Prompts: </strong>Providing poorly constructed prompts can lead to low-quality replies from LLMs, potentially containing hallucinations.</p><p><strong>Misunderstanding Questions: </strong>LLMs occasionally comprehend queries incorrectly, triggering erroneous and hallucinatory responses.</p><p><strong>Overfitting: </strong>Due to overfitting, LLMs might memorize certain patterns in the training data rather than gain generalizable knowledge, contributing to hallucinations.</p><h3>What can be done to mitigate this problem</h3><p>‘What can be done to alleviate this problem?’ refers to the issue of hallucinations in Large Language Models (LLMs). According to the conversation, hallucinations in LLMs occur due to their limitations in reasoning, learning, and utilizing external sources. Instead, they rely solely on producing coherent and fluent texts based on patterns recognized within their training data. As a result, the model may create contradictory or wrong statements, such as claiming that a tomato is both a fruit and a vegetable.</p><p>To address this challenge, several methods can be employed to improve the quality of answers provided by LLMs. 
One approach involves supplying clear and concise prompts, including sufficient background information to assist the model in comprehending the situation.</p><p>Another strategy entails incorporating <strong>retrieval-augmented generation</strong>, which enables searching a knowledge base for pertinent articles and integrating them into the dialogue to further improve the model’s response. Providing examples and restricting inputs to 4,096 tokens or fewer is recommended since the quality deteriorates beyond that point.</p><p>Furthermore, fine-tuning the model is feasible, although it can be costly and yield mixed outcomes, particularly when dealing with real-time data. Therefore, offering adequate context in each prompt remains essential.</p><p>Additionally, it’s crucial to remember that attributing and explaining AI remain vital steps toward reducing hallucinations. By furnishing users with precise details regarding the source of the solution, such as a particular knowledge base document, PDF file, or database entry, a starting point for explainable AI and content grounding can be established, contributing to minimizing hallucinations significantly.</p><h3>How can we get more context from the model</h3><p>In summary, large language models (LLMs) have several limitations, such as not being able to reason, use logic, or access external information. They are also unable to comprehend context, leading to hallucinations or incorrect outputs.</p><p>To mitigate these issues, providing context through strategies such as writing clear prompts, injecting context into the prompt, utilizing retrieval-augmented generation, and fine-tuning models can improve response quality. 
However, there are tradeoffs between quality and cost, and finding the optimal balance is crucial.</p><h3>What are some examples of this</h3><p>Some instances of hallucinations in large language models (LLMs) include producing erroneous or misleading outputs, such as claiming that the sky is purple or that a tomato is both a fruit and a vegetable. Another illustration is a response that contradicts the request, such as getting a bad review after asking for a favorable assessment of a book.</p><p>Additionally, LLMs may offer factually wrong replies, such as stating that someone has authored 173 books when they have only written 173 articles or patents. To address these concerns, methods including prompt engineering, retrieval-augmented generation, and vector databases can be employed.</p><p>By giving precise directions and enough background, prompt engineering may assist in ensuring that the model produces truthful and valuable responses.</p><h3>Why is this important</h3><p><em>The Importance of Understanding the Limitations of Large Language Models</em></p><p>In the field of artificial intelligence, large language models (LLMs) have gained significant attention due to their ability to generate human-like language outputs. However, these models are not without their limitations, and it is crucial to recognize these limitations to avoid misusing or misunderstanding their capabilities. In this essay, we will explore the importance of understanding the limitations of LLMs and how doing so can lead to more responsible and effective usage of these models.</p><p>Firstly, it is essential to comprehend what LLMs truly are. As stated by Mihai Criveti, a renowned expert in the field, LLMs are merely models that act on language and nothing else. They cannot process images, videos, or any other forms of media; they solely operate on text inputs.</p><p>Additionally, these models lack reasoning abilities and do not utilize external sources such as the internet during their processing. 
Instead, they rely heavily on statistical predictions based on patterns found within the training data. Therefore, it is inaccurate to assume that LLMs possess intelligent thought processes or can engage in independent learning.</p><p>Another critical limitation of LLMs is their tendency to produce hallucinations, which refer to outputs that are either incorrect or utterly nonsensical. These hallucinations occur due to the model’s reliance on statistical probabilities rather than genuine understanding.</p><p>As highlighted by Criveti, LLMs frequently fail to grasp context, leading to bizarre statements such as “the sky is purple.” To mitigate this challenge, proper contextualization and priming are vital. By supplying adequate background information and specifying the desired topic, users can increase the likelihood of receiving relevant and accurate responses from LLMs.</p><p>Furthermore, it is crucial to acknowledge that LLMs do not inherently learn from interactions. Although some platforms may claim that their models adapt to user input, this statement is largely misleading. In reality, the model’s performance may deteriorate over time since it relies on pre-existing statistics and lacks true learning capacities.</p><p>Conversely, humans possess the unique ability to develop new connections and comprehend novel ideas through experience and education. Thus, it is unfounded to compare LLMs to human cognition.</p><p>To address the challenges posed by LLMs’ limitations, innovative techniques such as few-shot prompting and retrieval augmented generation (RAG) have emerged. RAG, specifically, employs vector databases to enhance conventional searches, allowing for more flexible and inclusive queries. 
By integrating tools like RAG, users can bolster the accuracy and relevancy of LLMs’ outputs.</p><h3>What other approaches could be used</h3><p><em>Other approaches that could be used to improve the quality of responses generated by large language models include:</em></p><p><strong>Prompt Engineering: </strong>Providing clear and concise prompts that contain sufficient context can help elicit better responses from the model. This includes breaking down complicated tasks into smaller parts and providing relevant information along with the query.</p><p><strong>Few-Shot Learning: </strong>Providing multiple instances or illustrations that demonstrate the desired outcome can assist the model in producing superior replies. This strategy enables the model to learn from fewer examples and deliver appropriate outcomes.</p><p><strong>Vector Database:</strong> Using vector databases allows for quicker and more effective searches of vast volumes of data. It converts documents into numerical representations, enabling fuzzy searches and the identification of comparable vectors rather than precise terms. As a result, vector databases may enhance the precision and pertinence of retrieved material.</p><p><strong>Conversational Memory: </strong>By incorporating prior dialogue history into the chat, conversational memory enables the model to comprehend the context of a discussion better. Doing so improves the likelihood of receiving suitable and cohesive replies from the model.</p><p><strong>Fine-Tuning: </strong>Although costly and resource-intensive, fine-tuning models can improve their accuracy and effectiveness on particular tasks. With adequate funds and computing power, fine-tuning can boost performance.</p><p><strong>Contextual Understanding</strong>: Ensuring that the model possesses contextual awareness and understands the intent behind a request is crucial. One method to achieve this is to specify that the model won’t create anything it isn’t familiar with. 
Instead, it would rely on reliable sources.</p><p><strong>Retrieval Augmented Generation:</strong> RAG combines retrieval and generation capabilities to produce higher-quality responses. It involves inserting pertinent articles or excerpts from a database into the model’s prompt to offer context and guide the creation of improved responses.</p><p><strong>Knowledge Base Integration:</strong> Combining a large language model with a structured repository of information can increase its capacity to produce insightful and exact answers. The model can access relevant details and offer users more thorough and informed responses by integrating a knowledge base.</p><h3>What is the benefit of vector databases</h3><p><em>Benefits of Vector Databases: Improved Search Functionality and Explainable AI</em></p><p>Vector databases offer various benefits over traditional SQL databases, particularly when utilized in conjunction with large language models (LLMs) for natural language processing tasks. One significant advantage is the ability to perform efficient similarity searches, which enhances the effectiveness of the model in generating relevant and accurate responses.</p><p>Unlike traditional databases that rely solely on exact keyword matching, vector databases employ vectorization techniques to transform texts into numerical representations, enabling the identification of semantically related concepts. As a result, queries yield more precise and diverse outcomes, better addressing users’ needs and minimizing hallucinations typically associated with LLMs.</p><p>Moreover, incorporating vector databases contributes to developing explainable AI systems since it allows for tracing back the sources used to inform generated responses, promoting transparency and trustworthiness in AI interactions. 
By integrating vector databases into their workflows, developers can create more reliable and accountable language models, leading to improved overall performance in natural language processing applications.</p><h3>What is the last word of the podcast</h3><p><em>The final statement of the provided material is: “See you again on an upcoming episode.” Therefore, the ultimate word of the podcast is “episode”.</em></p><h3>What is the name of the podcast Mihai was on</h3><p>The name of the podcast is <a href="https://listen.casted.us/public/95/The-Art-of-AI-7315085b/0a99a126/share/f79946ba">The Art of AI.</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=40296867eb8a" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Understand GenAI Large Language Model limitations, and how Retrieval Augmented Generation can help]]></title>
            <link>https://medium.com/@crivetimihai/understand-genai-large-language-model-limitations-and-how-retrieval-augmented-generation-can-help-d019e394eeb7?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/d019e394eeb7</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[genai]]></category>
            <category><![CDATA[large-language-models]]></category>
            <category><![CDATA[vector-database]]></category>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Sat, 09 Sep 2023 06:36:32 GMT</pubDate>
            <atom:updated>2023-10-15T01:13:47.270Z</atom:updated>
            <content:encoded><![CDATA[<h4>Add context from private data and documents to GenAI LLMs to reduce hallucinations and increase performance through Retrieval Augmented Generation.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*k34NYOsRRbmLU8t6BkfQkA.png" /><figcaption>Use Cases for Large Language Models</figcaption></figure><p>This article is featured on the <a href="https://listen.casted.us/public/95/The-Art-of-AI-7315085b/0a99a126/share/f79946ba">Hallucinations episode from “The Art of AI”</a> podcast, where host <a href="https://www.linkedin.com/in/jerry-cuomo/">Jerry Cuomo</a> has a conversation with <a href="https://www.linkedin.com/in/crivetimihai/">Mihai Criveti</a> on practical approaches to reduce LLM hallucinations.</p><p>I’ve also written an <a href="https://medium.com/@crivetimihai/reducing-ai-hallucinations-with-rag-writing-an-entire-article-from-a-podcast-episode-40296867eb8a">‘AI Version’ of this article</a> here, written entirely by AI, using advanced RAG and hallucination reduction techniques, using only the audio podcast as input. Check it out!</p><p><strong>Key use cases for Large Language Models include:</strong></p><ul><li><strong>Generation</strong>: LLMs can be used to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way. For example, LLMs can be used to generate realistic dialogue for chatbots, write news articles, or even create poems.</li><li><strong>Summarization</strong>: LLMs can be used to summarize text, extract the main points of an article or document, and create a shorter version that is still accurate and informative. For example, LLMs can be used to summarize research papers, news articles, or even books.</li><li><strong>Classification</strong>: LLMs can be used to classify text, identify the topic of a document, and determine whether it is positive or negative, factual or opinion, etc. 
For example, LLMs can be used to classify customer reviews, social media posts, or even medical records.</li><li><strong>Extraction</strong>: LLMs can be used to extract information from text, identify specific entities or keywords, and create a table or list of the extracted information. For example, LLMs can be used to extract contact information from a business card, product information from a website, or even scientific data from a research paper.</li><li><strong>Q&amp;A:</strong> LLMs can be used to answer questions in an informative way, even if they are open ended, challenging, or strange. For example, LLMs can be used to answer questions about a particular topic, provide customer support, or even generate creative text formats of text content.</li></ul><h4>While Generative AI Large Language Models often seem like a panacea, they suffer from a number of key issues.</h4><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2Fp2_U2Nswj8I%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3Dp2_U2Nswj8I&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2Fp2_U2Nswj8I%2Fhqdefault.jpg&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/678b1fcc9e969739c112742a887a72f1/href">https://medium.com/media/678b1fcc9e969739c112742a887a72f1/href</a></iframe><ol><li><a href="https://www.youtube.com/watch?v=cfqtFvWOfg0"><strong>Hallucinations</strong></a>: outputs of LLMs that deviate from facts or contextual logic. Models will ‘make stuff up’ if they don’t know an answer. They also suffer from a lack of <strong>contextual understanding</strong>. Hallucinations can range from sentence contradictions, prompt contradictions, factual contradictions or just non-sense/noise. 
Techniques like <a href="https://www.w3schools.com/gen_ai/chatgpt-4/chatgpt-4_few_shot.php">few-shot</a> prompting and RAG can help.</li><li><strong>Inference Performance</strong>: even the faster models are slower than a dial-up modem, or a fast typist! They also suffer from <strong>latency or </strong>time to first token. For most queries, expect 10–20 second response times from most models, and even with streaming, you’ll end up waiting a few seconds for the first token to be generated!</li><li><strong>Inference Cost: </strong>LLMs are expensive to run! Some of the top 180B parameter models may need as many as 5xA100 GPUs to run, while even quantized versions of 70B LLAMA would take up a whole GPU! That’s one query at a time. The costs add up. For example, a dedicated A100 might cost as much as $20K a month with a cloud provider! A brute force approach is going to be expensive.</li><li><strong>Stale training data:</strong> even top models haven’t been trained on ‘recent’ data, and have a cut-off date. Remember, a model doesn’t ‘have access to the internet’. While certain ‘plugins’ do offer ‘internet search’, it’s just a form of RAG, where ‘top 10 internet search query results’ are fed into the prompt as context, for example.</li><li><strong>Use with private data</strong>: LLMs haven’t been trained on *your* private data, and as such, cannot answer questions based on your dataset, unless that data is injected through fine tuning or prompt engineering.</li><li><strong>Token limits / context window size:</strong> Models are limited by the TOKEN_LIMIT, and most models can process, at best, a few pages of total input/output. You can’t feed a model an entire document, and ask for a summary or extract facts from the document. You need to chunk documents into pages first, and perform multiple queries.</li><li><strong>They only support text:</strong> while this sounds obvious (from the name), it also means you can’t just feed a PDF file or WORD document to an LLM. 
You first need to convert that data to text, and chunk it to fit in the token limit, alongside your prompt and some room for output. Conversions aren’t perfect. What happens to your images, or tables, or metadata? It also means models can only output text. Formatting the text to output HTML or DOCX or other rich text formats requires a lot of heavy lifting in our pipeline.</li><li><strong>Lack of transparency / explainability:</strong> why did the model generate a particular answer? Techniques such as RAG can help, as you are able to point at the ‘context’ that generated a particular answer, and even display the context. While the LLM answer may not necessarily be correct, you can display the source content that helped generate that answer.</li><li><strong>Potential bias, hate, abuse, harm, ethical concerns, </strong>etc: sometimes, answers generated by an LLM can be outright harmful. Using the RAG pattern, in addition to HARM filters can help mitigate some of these issues. Models are also vulnerable to various forms of <strong>Prompt Hacking / Prompt Injection </strong>where you can trick the model to respond in a way it wasn’t designed to.</li><li><strong>Training and fine tuning costs: </strong>to put it in perspective, <a href="https://arxiv.org/pdf/2307.09288.pdf">a 70B model like LLAMA2</a> might need ~2048 A100 GPUs for a month to train, adding up to $20–40M training cost, not to mention what it takes to download and store the data. 
The “Training Hardware &amp; Carbon Footprint” section from the <a href="https://arxiv.org/pdf/2307.09288.pdf">LLAMA2 paper</a> suggests a total of 3311616 GPU hours were used to train LLAMA2 (7/13/34 and 70B)!</li></ol><p><a href="https://www.slideshare.net/cmihai/10-limitations-of-large-language-models-and-mitigation-options">10 Limitations of Large Language Models and Mitigation Options</a></p><p><strong>It helps to think of Large Language Models (LLMs) like mathematical functions, or your phone’s autocomplete:</strong></p><p><em>f(x) = x’</em></p><ul><li>Where the input (x) and the output (x’) are strings. The model starts by looking at the input, then will ‘autocomplete’ the output.</li><li>For example, f(“What is Kubernetes”) = “Kubernetes, often abbreviated as K8s, is an open-source platform designed to automate deploying, scaling, and operating application containers.”</li><li>Most chat interfaces will also provide a default system prompt. For <a href="https://replicate.com/blog/how-to-prompt-llama">LLAMA2</a>, this is: “You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don&#39;t know the answer to a question, please don&#39;t share false information.”</li><li>Depending on the model and interface, there may be ‘hidden’ inputs to your model. Many Chat interfaces will include a conversational memory, where they insert a moving window of your previous prompts into the current prompt, as context. It would look something like this: “Below are a series of dialogues between a user and a AI assistant…. 
[dialogues] [new content]”</li></ul><h4>The inputs to a model are a little more complex though:</h4><p><em>f(training_data, model_parameters, input_string) = output_string</em></p><ul><li>training_data represents the data it was trained on (different models will provide different answers). While not an ‘input’ as such, the data the model was trained on (and how it was trained) plays a key role in the output.</li><li><a href="https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task">model_parameters</a> represent things like “temperature”, “repetition penalty”, “min tokens” or “max tokens”, “top_p”, “top_k”, and other such values.</li><li>input_string is the combination of <strong>prompt</strong> and <strong>context</strong> you give to the model. Ex: “What is Kubernetes” or “Summarize the following document: ${DOCUMENT}”</li><li>the ‘<strong>prompt</strong>’ is usually an optional instruction like “summarize”, “extract”, “translate”, “classify” etc. but more complex prompts are usually used. “Be a helpful assistant that responds to my question.. etc.”</li><li>The function can process a maximum of <strong>TOKEN_LIMIT</strong> (total input and output), usually ~4096 tokens (~3000 words in English, fewer in, say, Japanese). Models with larger TOKEN_LIMITS exist, though they usually don’t perform as well above the 4096 token limit. This means, in practice, you can’t feed a whole whitepaper to an LLM and ask it to ‘summarize this document’, for example.</li></ul><h3>What Large Language Models DON’T DO</h3><p><strong>Learn</strong>: A model will not ‘learn’ from interactions (unless specifically trained/fine-tuned).</p><p><strong>Remember</strong>: A model doesn’t remember previous prompts. In fact, it’s all done with prompt trickery: previous prompts are injected. 
The API does a LOT of filtering and heavy lifting!</p><p><strong>Reason:</strong> Think of LLMs like your phone’s autocomplete: they don’t reason, or do math.</p><p><strong>Use your data:</strong> LLMs don’t provide responses based on YOUR data (databases or files), unless it’s included in the training dataset, or the prompt (ex: RAG).</p><p><strong>Use the Internet:</strong> An LLM doesn’t have the capacity to ‘search the internet’, or make API calls.</p><ul><li>In fact, a model does not perform <em>any</em> activity other than converting one string of text into another string of text.</li><li>Any 3rd party data not in the model will need to be injected into prompts (RAG)</li></ul><p><strong>Cite Sources:</strong> models may ‘appear’ to cite sources, but that’s not reliable. It’s more likely to be a hallucination. Asking a model what content generated an output isn’t going to provide a relevant response. Retrieval Augmented Generation can help provide attribution / grounding though.</p><p><strong>Adding an LLM to your software architecture:</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*aUD4euAjDnRreb9hTcjghA.png" /><figcaption>An LLM is much, much slower than a Faxmodem!</figcaption></figure><p>Believe it or not, LLMs are <strong>much slower </strong>than even a faxmodem! At WPM = ((BPS / 10) / 5) * 60, a 9600 baud modem will generate 11520 words / minute.</p><p>At an average 30 tokens / second (20 words) for LLAMA-70B, you’re getting 1200 words / minute!</p><p>Large models (70B) such as LLAMA2 can be painfully slow. Smaller models (20B, 13B, 7B) are faster, and require less GPU to run. 
Quantized models are also faster, but provide lower quality responses.</p><h4>Quantize your model for faster inference</h4><p>You can <a href="https://huggingface.co/docs/transformers/main/main_classes/quantization">load and quantize your model</a> in 8, 4, 3 or even 2 bits, sacrificing quality for faster inference speed.</p><p>This is always a trade-off, as you’re sacrificing model output quality for faster inferencing. Since a quantized model needs less GPU VRAM to run in, this helps you run large models on commodity hardware.</p><h4>Why do models hallucinate?</h4><ol><li><strong>Data Quality:</strong> The model itself has been trained on biased, noisy, old, low quality or incorrect data. For example, models trained on forums and other such data.</li><li><strong>Generation Method:</strong> Models and their weights might be biased towards specific languages, words or data.</li><li><strong>Lack of context or contextual understanding:</strong> The input prompt is contradictory, or unclear. The prompt does not provide sufficient examples of the desired output. The model lacks context to respond to the input, either in its dataset or the prompt. This is within the control of the user, and can be improved through prompt engineering.</li></ol><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FcfqtFvWOfg0%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcfqtFvWOfg0&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FcfqtFvWOfg0%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/e593888bbaeb099ddede40f41c3fcfdf/href">https://medium.com/media/e593888bbaeb099ddede40f41c3fcfdf/href</a></iframe><h4>Reducing model hallucinations:</h4><p>LLMs lack context from private data — leading to hallucinations when asked domain or company-specific questions. 
RAG can help reduce hallucinations by ‘injecting’ context into prompts.</p><p>Workarounds also include advanced prompt engineering, adding a prompt such as: If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct, or providing examples through one-shot or few-shot prompting.</p><p>Other techniques include active mitigations, such as controlling the parameters of the model, for example <strong>“temperature”,</strong> which controls how ‘creative’ the response is.</p><p>Another approach is <a href="https://huggingface.co/papers/2309.11495">Chain-of-Verification (CoVe)</a> — where an LLM generates verification questions to fact check the initial response, then answers them, then verifies the answers against the initial response. As you can imagine, these are very ‘costly’ operations from a computational / time perspective, but may be required where accuracy is paramount.</p><p><strong>Papers:</strong></p><ul><li><a href="https://www.pinecone.io/learn/retrieval-augmented-generation/">Retrieval Augmented Generation as a mechanism to reduce hallucinations</a></li><li><a href="https://arxiv.org/abs/2104.07567">Retrieval Augmentation Reduces Hallucination in Conversation</a></li><li><a href="https://arxiv.org/abs/2308.06394">Detecting and Preventing Hallucinations in Large Vision Language Models</a></li><li><a href="https://arxiv.org/abs/2201.11903">Chain-of-Thought Prompting Elicits Reasoning in Large Language Models</a></li><li><a href="https://arxiv.org/abs/2305.10601">Tree of Thoughts: Deliberate Problem Solving with Large Language Models</a></li><li><a href="https://arxiv.org/abs/2308.11764">Halo: Estimation and Reduction of Hallucinations in Open-Source Weak Large Language Models</a></li></ul><h3>Retrieval Augmented Generation and the importance of Vector Databases</h3><p>A vector database is a specialized database designed to store and query vector embeddings efficiently. 
Vector embeddings are numerical representations of text, images, audio, or other data. They are used in a variety of machine learning applications, such as natural language processing, image recognition, and recommendation systems.</p><p><strong>Near Vector search or how to Search for “Sky” and find “Blue”:</strong></p><ul><li>Finding the most similar documents to a given document</li><li>Finding documents that contain a specific keyword or phrase</li><li>Clustering documents together based on their similarity</li><li>Ranking documents for a search query</li></ul><p>Popular vector databases include ChromaDB, Weaviate, and Milvus.</p><p><strong>Advantages of using a VectorDB with your LLM, in a Retrieval Augmented Generation Pattern:</strong></p><ul><li>Insert your data into prompts every time</li><li>Cheap, and can work with vast amounts of data</li><li>While LLMs are SLOW, Vector Databases are FAST!</li><li>Can help overcome model limitations (such as token limits) — as you’re only feeding ‘top search results’ to the LLM, instead of whole documents.</li><li>Reduce hallucinations by providing context.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1009/1*kzn4Bat9LXgoE_cmYpjmHg.png" /></figure><h4><strong>Loading Documents into your Vector Databases:</strong></h4><p>Loading data into your vector database typically requires you to convert documents to text, split the text into chunks, then vectorize those chunks using an embedding model. 
<a href="https://www.sbert.net/">SentenceTransformers</a> offers a number of pre-trained models, such as <strong>all-mpnet-base-v2</strong> or<strong> all-MiniLM-L12-v2</strong> that perform well for English text.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/893/1*dlyUqmzR11QOgmr4DQAvWg.png" /></figure><h3>Scaling factor for RAG: what to consider:</h3><ul><li>Vector Database: consider sharding and High Availability</li><li>Fine Tuning: collecting data to be used for fine tuning</li><li>Governance and Model Benchmarking: how are you testing your model performance over time, with different prompts, one-shot, and various parameters</li><li>Chain of Reasoning and Agents</li><li>Caching embeddings and responses</li><li>Personalization and Conversational Memory Database</li><li>Streaming Responses and optimizing performance. A fine tuned 13B model may perform better than a poor 70B one!</li><li>Calling 3rd party functions or APIs for reasoning or other type of data (ex: LLMs are terrible at reasoning and prediction, consider calling other models)</li><li>Fallback techniques: fallback to a different model, or default answers</li><li>API scaling techniques, rate limiting, etc.</li><li>Async, streaming and parallelization, multiprocessing, GPU acceleration (including embeddings), generating your API using OpenAPI, etc.</li><li>Retraining your embedding model</li></ul><h3>RAG Talk from Shipitcon can be found on GitHub and YouTube:</h3><ul><li><a href="https://github.com/crivetimihai/shipitcon-scaling-retrieval-augmented-generation">https://github.com/crivetimihai/shipitcon-scaling-retrieval-augmented-generation</a></li><li><a href="https://www.youtube.com/watch?v=lL4DPcxljH8">https://www.youtube.com/watch?v=lL4DPcxljH8</a></li></ul><h3>Social media</h3><ul><li><a href="https://twitter.com/CrivetiMihai">https://twitter.com/CrivetiMihai</a> — follow for more LLM content</li><li><a href="https://youtube.com/CrivetiMihai">https://youtube.com/CrivetiMihai</a> — 
more LLM videos to follow</li><li><a href="https://www.linkedin.com/in/crivetimihai/">https://www.linkedin.com/in/crivetimihai/</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=d019e394eeb7" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building Data Science Environments]]></title>
            <link>https://medium.com/@crivetimihai/building-data-science-environments-9e770f6bf2c1?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/9e770f6bf2c1</guid>
            <category><![CDATA[mainframe]]></category>
            <category><![CDATA[jupyter]]></category>
            <category><![CDATA[docker]]></category>
            <category><![CDATA[ibm]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Fri, 04 Aug 2023 16:47:49 GMT</pubDate>
            <atom:updated>2023-08-04T16:47:49.427Z</atom:updated>
            <content:encoded><![CDATA[<h3>Data Science on the Mainframe</h3><h4>Docker, Python and Jupyter Notebook on zLinux.</h4><p>Data Science environments with Docker Compose running Jupyter Lab, PostgreSQL, PGAdmin, Superset, Grafana and Traefik — running on zLinux on the IBM Mainframe.</p><p>Here, I’m using Red Hat Enterprise Linux 7.5 to build and deploy Jupyter notebook in an Ubuntu container, create a Redis Alpine container and a postgresql container — then link them using docker-compose.</p><p>Oh, and in case you’re wondering: why would anyone do this — check out this snippet from the <a href="https://developer.ibm.com/mainframe/2017/07/17/ibm-z-software-z14-announcement/">z14 announcement:</a> “Microservices can be built on z14 with Node.js, Java, Go, Swift, Python, Scala, Groovy, Kotlin, Ruby, COBOL, PL/I, and more. They can be deployed in Docker containers where a single z14 can scale out to 2 million Docker containers”.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*AC3UlqJls8TL3FFs" /></figure><h3>A few basic commands:</h3><p>Establish the OS release and version. 
We’re running on RHEL 7.5 for s390x.</p><pre>[cmihai@rh74s390x ~]$ cat /etc/redhat-release<br>Red Hat Enterprise Linux Server release 7.5 (Maipo)</pre><pre>[cmihai@rh74s390x ~]$ uname -a<br>Linux rh74s390x.novalocal 3.10.0-693.17.1.el7.s390x #1 SMP Sun Jan 14 10:38:29 EST 2018 s390x s390x s390x GNU/Linux</pre><pre>[cmihai@rh74s390x ~]$ docker --version<br>Docker version 17.05.0-ce, build 89658be</pre><h3>Install or Upgrade Docker:</h3><pre>$ sudo ./icp-docker-18.03.1_s390x.bin --upgrade</pre><pre>$ /usr/bin/docker --version<br>Docker version 18.03.1-ce, build 9ee9f40</pre><h3>Setup regular user access, sudo and SSH keys</h3><h3>Create a regular user account</h3><pre>useradd cmihai<br>passwd cmihai<br>usermod -aG wheel cmihai<br>su - cmihai</pre><h3>Add your SSH public key to authorized_keys</h3><h3>Log in as your new user, and forward port 9000:</h3><h3>Setup docker</h3><h3>Create the Docker group</h3><h3>Start Docker</h3><h3>Test docker</h3><h3>Let’s run a simple Ubuntu interactive shell:</h3><h3>Building a Docker container for Jupyter Notebook</h3><p>Create a file <em>Dockerfile_jupyter</em> from the s390x/ubuntu base image.</p><h3>Build the container:</h3><h3>Run your new container:</h3><p>Connect to Jupyter Notebook</p><p><a href="http://localhost:9000/">http://localhost:9000</a></p><h3>You can now install dependencies directly from Jupyter:</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*S4iTppVVkywkV2sT" /></figure><h3>Setup docker-compose:</h3><p>Set up Docker Compose and build multi-tiered application specifications — such as connecting your Jupyter Notebook to PostgreSQL, Redis, Spark, etc.</p><h3>Setup and link Jupyter to a database container (Redis):</h3><h3>Create a Redis container image:</h3><p>Create a `Dockerfile_redis` from the alpine image, a lightweight Linux distribution:</p><p>Then build the container</p><h3>Setup docker-compose to link Jupyter to Redis:</h3><p>Create a file<em> compose-redis-jupyter.yaml</em></p><p>Start 
your container</p><h3>Connect Jupyter to Redis with Python code:</h3><p>In Jupyter Notebook:</p><p>You now have access to Jupyter Notebook with a Python 3 kernel and a key-value store (Redis).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*rkmhTpdGNal1aJ-o" /></figure><h3>Setup Postgresql:</h3><p>Many of the official images run on multiple architectures. This includes the postgres image, which will run on s390x. Let’s connect Jupyter Notebook to postgres, using docker-compose:</p><p>In Jupyter, we can now access postgres, as shown below:</p><p>Jupyter Notebook will connect to the postgres database:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*_af6bYUk4lvw7_m3" /></figure><p>Potential next steps:</p><ul><li>Set up other programming languages or kernels (Java, R), or even Zeppelin Notebook</li><li>Setup Spark or Kafka</li></ul><p>For an interactive tutorial of using Docker for Data Science, check out: <a href="https://github.com/crivetimihai/docker-data-science">https://github.com/crivetimihai/docker-data-science</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=9e770f6bf2c1" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Manage Containers with Podman, Skopeo and Buildah]]></title>
            <link>https://medium.com/@crivetimihai/manage-containers-with-podman-skopeo-and-buildah-4ef228bea87?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/4ef228bea87</guid>
            <category><![CDATA[buildah]]></category>
            <category><![CDATA[skopeo]]></category>
            <category><![CDATA[containers]]></category>
            <category><![CDATA[podman]]></category>
            <category><![CDATA[red-hat]]></category>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Fri, 04 Aug 2023 16:42:56 GMT</pubDate>
            <atom:updated>2023-08-04T16:42:56.995Z</atom:updated>
            <content:encoded><![CDATA[<h3>Part 1</h3><ul><li>Install podman, buildah and skopeo</li></ul><h3>Part 2</h3><ul><li>Publish images to external registries</li><li>Quay.io and Clair</li></ul><h3>Part 3</h3><ol><li>Install CodeReady Containers, Create a project called wordpress</li><li>Create users and groups and setup htpasswd authentication</li><li>Deploy mysql from registry.redhat.io/rhel8/mysql-80 and configure the secret</li><li>Deploy wordpress from docker.io/library/wordpress:5.5.0-php7.2-apache</li><li>Create a route and test wordpress, scale out</li></ol><h3>What is Podman?</h3><p>podman — manage pods, containers and OCI compliant container images</p><h3>How is Podman different?</h3><ul><li>Can be run as a regular user without requiring root.</li><li>Can manage <strong>pods</strong> (groups of one or more containers that operate together).</li><li>Lets you import Kubernetes definitions using podman play.</li><li>Fork-exec model instead of client-server model (containers are child processes of podman).</li><li>Compatible with Docker, Docker Hub or any OCI compliant container implementation.</li></ul><p><a href="https://www.redhat.com/en/blog/why-red-hat-investing-cri-o-and-podman">https://www.redhat.com/en/blog/why-red-hat-investing-cri-o-and-podman</a> <a href="https://developers.redhat.com/blog/2019/02/21/podman-and-buildah-for-docker-users/">https://developers.redhat.com/blog/2019/02/21/podman-and-buildah-for-docker-users/</a></p><h3>What is Buildah?</h3><p>buildah — build container images from CLI or Dockerfiles</p><h3>How is Buildah different?</h3><ul><li>Containers can be built using simple CLI commands or shell scripts instead of Dockerfiles.</li><li>Images can then be pushed to any container registry and can be used by any container engine, including Podman, CRI-O, and Docker.</li><li>Buildah is also often used to securely build containers while running inside of a locked down container by a tool like Podman, OpenShift/Kubernetes or 
Docker.</li></ul><h3>What is Skopeo?</h3><p>skopeo: inspect and copy containers and images between different storage locations</p><h3>How does Skopeo help?</h3><ul><li>It can copy images to and from a host, as well as to other container environments and registries.</li><li>Skopeo can inspect images from container image registries, get images and image layers, and use signatures to create and verify images.</li></ul><h3>Red Hat Image Sources Explained</h3><h3>Red Hat Software Collections Library (RHSCL)</h3><ul><li>For developers that need the latest versions of tools not in the RHEL release schedule.</li><li>Use the latest development tools without impacting RHEL.</li><li>Available to all RHEL subscribers.</li></ul><h3>Red Hat Container Catalog (RHCC)</h3><ul><li>Certified, curated and tested images built on RHEL.</li><li>Images have gone through a QA process.</li><li>Upgraded on a regular basis to avoid security vulnerabilities.</li></ul><h3>Quay.io</h3><ul><li>Public / private container repository.</li></ul><p><a href="https://github.com/sclorg?q=-container">https://github.com/sclorg?q=-container</a> for Dockerfiles</p><h3>Universal Base Image (UBI)</h3><h3>Red Hat Universal Base Image (UBI)</h3><p>UBI: freely distributable, OCI compliant, secure container base images based on RHEL</p><h3>How does UBI Help?</h3><ul><li>More than just a base image, UBI provides three base images across RHEL 7 and RHEL 8: ubi, ubi-minimal and ubi-init</li><li>And a set of language runtimes (ex: nodejs, ruby, python, php, perl, etc.)</li><li>All packages in UBI come from RHEL channels and are supported on RHEL and OpenShift.</li><li>Secure by default, maintained and supported by Red Hat.</li></ul><h3>The Red Hat Container Catalog</h3><h3>Certified container images from Red Hat and 3rd party vendors</h3><p>Container Images with a Container Health Index</p><h3>Pulling a container image</h3><p>podman pull registry.access.redhat.com/ubi8/python-38</p><h3>Podman Compose</h3><h3>What is 
podman-compose?</h3><ul><li>An implementation of docker-compose with a Podman backend.</li></ul><h3>Why podman-compose and when to use it?</h3><ul><li>Runs unmodified docker-compose.yaml files, rootless.</li><li>No daemon or setup required.</li><li>Only depends on podman, Python 3 and PyYAML.</li></ul><h3>When NOT to use podman-compose?</h3><ul><li>When you can use podman pod, or podman generate kube and podman play kube, to create pods or import Kubernetes definitions instead.</li><li>For single-machine development, consider <a href="https://developers.redhat.com/products/codeready-containers/overview">CodeReady Containers</a></li><li>For multi-node clusters, check out Red Hat OpenShift, Kubernetes or <a href="https://www.okd.io/minishift/">OKD</a>.</li></ul><p><a href="https://developers.redhat.com/blog/2019/01/15/podman-managing-containers-pods/">https://developers.redhat.com/blog/2019/01/15/podman-managing-containers-pods/</a></p><h3>Install podman, buildah and skopeo</h3><h3>Fedora 32 / RHEL 8</h3><p># Install podman, buildah and skopeo on Fedora 32<br>sudo dnf -y install podman buildah skopeo slirp4netns fuse-overlayfs</p><h3>Ubuntu / Debian</h3><p>sudo apt update &amp;&amp; sudo apt -y install podman buildah skopeo</p><h3>Getting help</h3><p>podman version<br>podman --help # list available commands<br>man podman-ps # or commands like run, rm, rmi, image, build<br>podman info # display podman system information</p><p><a href="https://podman.io/getting-started/installation">https://podman.io/getting-started/installation</a></p><p>slirp4netns is used to connect a network namespace to the internet in a rootless way.</p><h3>Rootless Containers and cgroup v2</h3><h3>Note that our regular user has UID 1000</h3><p>uid=1000(cmihai) gid=1000(cmihai) groups=1000(cmihai)</p><h3>What are UIDs mapped to inside the container?</h3><p>podman unshare cat /proc/self/uid_map</p><p>0 1000 1<br>1 100000 65536</p><blockquote><em>UID 0 is mapped to my UID (1000). 
UID 1 is mapped to 100000, UID 2 would map to 100001, etc. That means that a container UID of 27 would map to UID 100026.</em></blockquote><h3>Let’s test this</h3><p>mkdir test &amp;&amp; podman unshare chown 27:27 test<br>ls -ld test</p><p>drwxrwxr-x. 2 100026 100026 4096 Sep 27 09:38 test</p><p><a href="https://developers.redhat.com/blog/2020/09/25/rootless-containers-with-podman-the-basics/">https://developers.redhat.com/blog/2020/09/25/rootless-containers-with-podman-the-basics/</a> <a href="https://podman.io/blogs/2019/10/29/podman-crun-f31.html">https://podman.io/blogs/2019/10/29/podman-crun-f31.html</a></p><h3>Running Containers with Podman</h3><h3>Searching for Images with podman search</h3><h3>Configure search sources</h3><p>grep search /etc/containers/registries.conf</p><p>unqualified-search-registries = [&#39;registry.fedoraproject.org&#39;, &#39;registry.access.redhat.com&#39;,<br>&#39;registry.centos.org&#39;, &#39;docker.io&#39;]</p><h3>Searching for images (with filters)</h3><p>podman search httpd --filter=is-official</p><p>INDEX NAME DESCRIPTION STARS OFFICIAL<br>docker.io docker.io/library/httpd The Apache HTTP Server 3181 [OK]</p><p>podman can be configured to search multiple private or public container registries for images.</p><h3>Adding a local registry configuration</h3><h3>Create a configuration file</h3><p>mkdir -p ~/.config/containers</p><h3>Add public and private registries in search order</h3><p>vim $HOME/.config/containers/registries.conf</p><p>[registries.search]<br>registries = [&quot;registry.access.redhat.com&quot;, &quot;quay.io&quot;, &quot;docker.io&quot;]</p><p>[registries.insecure]<br>registries = [&#39;localhost:5000&#39;]</p><h3>Inspecting and pulling images</h3><h3>Inspecting Images with skopeo (ex: listing tags)</h3><p>skopeo inspect docker://docker.io/library/httpd</p><h3>Inspect the image with podman and show image history</h3><p>podman inspect httpd:2.4.46-alpine<br>podman history httpd:2.4.46-alpine</p><h3>Pulling Images locally with podman 
pull</h3><p>podman pull docker.io/library/httpd:2.4.46-alpine<br>podman images</p><p><a href="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/managing_containers/finding_running_and_building_containers_with_podman_skopeo_and_buildah">https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/managing_containers/finding_running_and_building_containers_with_podman_skopeo_and_buildah</a> Skopeo lets you inspect and copy images between registries.</p><h3>Running containers in interactive mode</h3><h3>Run an interactive session</h3><p>podman run --name ubuntu --hostname ubuntu \<br>--interactive --tty ubuntu /bin/bash</p><h3>Reattach</h3><p>podman start --attach --interactive ubuntu</p><h3>Delete your container on exit</h3><p>podman run --rm --name ubuntu --hostname ubuntu \<br>--interactive --tty ubuntu /bin/bash</p><p>At this point, you could save your modified container and push it to a registry, but there’s a better way: build your container using a Dockerfile or buildah.</p><h3>Running Containers in the background</h3><h3>Run a sample httpd container to serve a webpage</h3><p># Run a container in the background, bind port 8080<br>podman run --name httpd --detach \<br>--publish 8080:8080/tcp \<br>registry.fedoraproject.org/f29/httpd</p><ul><li>We’ve named the container httpd to make it easier to access later.</li><li>Port 8080 on the host is forwarded to port 8080 inside the container.</li><li>Notice that we’re using an image that binds to a non-privileged (8080) port.</li></ul><h3>Test the webpage</h3><p>curl localhost:8080</p><h3>Check the process</h3><p>ps -ef | grep podman</p><h3>Check container status and logs</h3><h3>Check the container status and logs</h3><p># List the running containers<br>podman ps</p><p># Inspect the last container that was run: check the Env and IP sections<br>podman inspect -l</p><p># Check the container logs<br>podman logs httpd</p><p>The Env section is especially 
useful for identifying the environment variables used to start the container.</p><h3>Starting and stopping containers</h3><h3>Stop the container and check the status</h3><p>podman stop httpd<br>podman ps -a</p><p>IMAGE CREATED STATUS<br>httpd:latest 5 minutes ago Exited (0) 2 seconds ago</p><h3>Start the container again</h3><p>podman start httpd<br>podman ps</p><p>CREATED STATUS PORTS NAMES<br>7 minutes ago Up 13 seconds ago 0.0.0.0:8080-&gt;8080/tcp httpd</p><h3>Using Environment Variables</h3><h3>Search and inspect an image</h3><p>podman search mysql-57-rhel7<br>skopeo inspect \<br>docker://registry.access.redhat.com/rhscl/mysql-57-rhel7 \<br>| grep usage</p><h3>Deploy MySQL</h3><p>podman run --name mysql \<br>-e MYSQL_USER=user -e MYSQL_PASSWORD=password \<br>-e MYSQL_DATABASE=mydb -e MYSQL_ROOT_PASSWORD=password \<br>--detach rhscl/mysql-57-rhel7:latest</p><h3>Check the logs</h3><p>podman logs mysql</p><h3>Executing a command in a running container with podman-exec</h3><h3>Inspect the environment (different methods shown)</h3><p>podman inspect -f &#39;{{ .NetworkSettings.IPAddress }}&#39; mysql<br>podman exec mysql env</p><h3>Execute a shell inside the mysql container</h3><p>podman exec -it mysql bash<br>mysql -uroot<br>show databases;<br>use mydb;</p><p>exit</p><h3>Execute a command</h3><p>podman exec -it mysql \<br>/opt/rh/rh-mysql57/root/usr/bin/mysql \<br>-uuser -ppassword -e &#39;show databases;&#39;</p><p>podman exec -it mysql /bin/bash -c &#39;mysql -uroot -e &quot;show databases;&quot;&#39;</p><p>Note that rootless podman won’t show an IPAddress (see <a href="https://podman.io/getting-started/network.html">https://podman.io/getting-started/network.html</a>); run podman with sudo to obtain the IP.</p><h3>Container Resources</h3><h3>Check processes inside container</h3><p>podman top -l</p><p>USER PID PPID %CPU ELAPSED TTY TIME COMMAND<br>default 1 0 0.000 2m4.13954806s ? 0s httpd -D FOREGROUND<br>default 18 1 0.000 2m4.139682033s ? 
0s /usr/bin/coreutils</p><h3>Display live stream of resource usage statistics</h3><p>podman stats</p><p>ID NAME CPU % MEM USAGE / LIMIT MEM % NET IO BLOCK IO PIDS<br>00b65 httpd 0.07% 40.91MB / 67.4GB 0.06% -- / -- -- / -- 217</p><h3>Check published ports</h3><p>podman port -l</p><p>8080/tcp -&gt; 0.0.0.0:8080</p><h3>Committing, Saving and Loading Images</h3><h3>Create an image</h3><p>podman run --name ubuntu-apache2 --hostname ubuntu-apache2 \<br>--interactive --tty ubuntu:20.04 /bin/bash</p><p># Install Apache HTTPD<br>apt update &amp;&amp; apt install -y apache2 &amp;&amp; exit</p><h3>List changed files (A: added, C: changed, D: deleted)</h3><p>podman diff ubuntu-apache2</p><h3>Commit image from container with entrypoint and label</h3><p>podman commit --change CMD=/bin/bash --change ENTRYPOINT=/bin/sh \<br>--change &quot;LABEL author=cmihai&quot; ubuntu-apache2 ubuntu-apache2</p><h3>Save an image</h3><p>podman save -o ubuntu-apache2.tar ubuntu-apache2</p><h3>Load an image</h3><p>podman load -i ubuntu-apache2.tar ubuntu-apache2</p><p>podman commit creates an image based on a changed container.</p><h3>Modifying the Apache image port from 80 to 8080</h3><blockquote><em>Use podman commit to create a custom Apache HTTPD image that listens on port 8080.</em></blockquote><h3>Search for the official image and run it</h3><p>podman search httpd --filter=is-official<br>podman run -it --name httpd-docker docker.io/library/httpd:2.4 /bin/bash</p><h3>Change the port number</h3><p>sed -i &#39;s/Listen 80/Listen 8080/g&#39; /usr/local/apache2/conf/httpd.conf<br>exit</p><h3>Save to a new image</h3><p>podman stop httpd-docker<br>podman diff httpd-docker<br>podman commit -a &#39;Mihai&#39; httpd-docker cmihai/httpd:2.4<br>podman images | grep cmihai/httpd<br>podman rm httpd-docker</p><h3>Test the new image</h3><p>podman run --detach --publish 8080:8080 --name httpd-cmihai cmihai/httpd:2.4 httpd-foreground<br>curl localhost:8080</p><h3>Tagging or Removing Tags from Images</h3><h3>Tag an image</h3><p>podman tag 
ubuntu-apache2 cmihai/apache2<br>podman tag ubuntu-apache2 cmihai/apache2:latest</p><h3>Remove a tag from an image</h3><p>podman rmi cmihai/apache2:latest</p><p>Multiple tags can point to the same image.</p><h3>Pushing an image to a Registry</h3><h3>Tag the image in your local repository</h3><p>podman tag ubuntu-apache2 quay.io/cmihai/apache2:latest</p><h3>Push to quay.io</h3><p>podman push quay.io/cmihai/apache2:latest</p><h3>Example build and push</h3><p>podman login quay.io<br>podman build --layers=false -t cmihai/jupyterlab:python38 .<br>podman tag localhost/cmihai/jupyterlab:python38 \<br>quay.io/cmihai/jupyterlab:latest<br>podman push quay.io/cmihai/jupyterlab:latest</p><p>Podman will automatically add the latest tag if you do not specify a tag!</p><h3>Volumes</h3><h3>Create a volume directory on the host and provide permissions</h3><p>mkdir myvol<br>podman unshare chown 999:999 myvol</p><p><a href="https://www.redhat.com/sysadmin/user-namespaces-selinux-rootless-containers">https://www.redhat.com/sysadmin/user-namespaces-selinux-rootless-containers</a></p><h3>Create a container and attach a volume to /data as rw</h3><p>podman run --rm --name ubuntu \<br>--volume ./myvol:/data:Z \<br>--interactive --tty ubuntu /bin/bash</p><ul><li>The :Z suffix tells podman to relabel the volume’s content to match the label inside the container.</li><li>podman unshare runs a command inside a modified user namespace.</li></ul><h3>SELinux Permissions: manual approach without using unshare</h3><h3>Let’s check what a MySQL container runs as</h3><p>podman run -ti rhscl/mysql-57-rhel7:latest grep mysql /etc/passwd</p><p>mysql:x:27:27:MySQL Server:/var/lib/mysql:/sbin/nologin</p><h3>Create a directory as root and give it the matching owner and group</h3><p>sudo mkdir mysql-data &amp;&amp; sudo chown -R 27:27 mysql-data</p><h3>Apply the SELinux container_file_t context and policy</h3><p>sudo semanage fcontext -a -t container_file_t &#39;./mysql-data(/.*)?&#39;<br>sudo restorecon -Rv 
./mysql-data</p><h3>Running MySQL with a host directory volume</h3><h3>Start MySQL</h3><p>podman run --name mysql \<br>-v ./mysql-data:/var/lib/mysql/data:Z \<br>-e MYSQL_USER=user -e MYSQL_PASSWORD=password \<br>-e MYSQL_DATABASE=mydb -e MYSQL_ROOT_PASSWORD=password \<br>--detach rhscl/mysql-57-rhel7:latest</p><h3>Troubleshoot logs and permissions</h3><p>podman logs mysql; sudo find ./mysql-data; ls -dnZ mysql-data</p><p>drwxrwxr-x. 6 100026 100026 system_u:object_r:container_file_t:s0:c303,c890<br>4096 Sep 27 09:09 mysql-data</p><h3>Check out the permissions inside the container</h3><p>podman exec mysql ls -lanZ /var/lib/mysql/data</p><p>drwxrwxr-x. 27 27 system_u:object_r:container_file_t:s0:c303,c890 .</p><h3>Alternatively, just use podman unshare:</h3><p>mkdir mysql-data<br>podman unshare chown 27:27 mysql-data<br>ls -dZ mysql-data</p><p>system_u:object_r:container_file_t:s0:c303,c890 mysql-data</p><h3>Volumes: MariaDB</h3><p>Let’s try this one more time with mariadb/server</p><h3>Network</h3><h3>Linking Containers</h3><h3>Pods</h3><h3>Cleanup</h3><h3>Stop and remove the container</h3><p>podman stop httpd<br>podman rm httpd<br>podman ps -a # Check that the containers are deleted</p><h3>Removing the container image</h3><p>podman rmi registry.fedoraproject.org/f29/httpd<br>podman images # List images</p><h3>Delete everything (stopped containers, pods, dangling images and build cache)</h3><p>podman system prune</p><h3>Setting a container to start at boot using systemd</h3><h3>Set the SELinux boolean and start your container</h3><p>setsebool -P container_manage_cgroup on<br>sudo podman run -d --name redis_server -p 6379:6379 redis</p><h3>vim /etc/systemd/system/redis-container.service</h3><p>[Unit]<br>Description=Redis container</p><p>[Service]<br>Restart=always<br>ExecStart=/usr/bin/podman start -a redis_server<br>ExecStop=/usr/bin/podman stop -t 2 redis_server</p><p>[Install]<br>WantedBy=multi-user.target</p><h3>Enable the service</h3><p>systemctl enable redis-container.service<br>systemctl 
start redis-container.service<br>systemctl stop redis-container.service<br>systemctl restart redis-container.service<br>systemctl status redis-container.service</p><p><a href="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/managing_containers/running_containers_as_systemd_services_with_podman">https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/managing_containers/running_containers_as_systemd_services_with_podman</a></p><h3>Creating a WordPress + MariaDB Server pod</h3><h3>Create a volume and apply permissions</h3><p>mkdir mariadb-data &amp;&amp; podman unshare chown 999:999 mariadb-data</p><h3>Create a pod</h3><p>podman pod create --publish 127.0.0.1:8888:80 --name wordpress-pod</p><h3>Run the image to list the available options</h3><p>podman run --rm mariadb:latest --verbose --help</p><h3>Start MariaDB with a custom configuration and data volume</h3><p>podman run --name mariadb --pod wordpress-pod \<br>-v ./mariadb-data:/var/lib/mysql:Z \<br>-e MYSQL_USER=wpsuser -e MYSQL_PASSWORD=password \<br>-e MYSQL_DATABASE=wordpress -e MYSQL_ROOT_PASSWORD=password \<br>--detach mariadb:10.5.5</p><h3>Start wordpress</h3><p>podman run --name wordpress --pod wordpress-pod \<br>-e WORDPRESS_DB_HOST=127.0.0.1:3306 \<br>-e WORDPRESS_DB_USER=wpsuser -e WORDPRESS_DB_PASSWORD=password \<br>-e WORDPRESS_DB_NAME=wordpress \<br>--detach wordpress:latest</p><h3>Building Containers with Podman from a Containerfile</h3><h3>Building container images from Containerfile with podman</h3><h3>Create a Containerfile for Jupyter Lab starting from a UBI</h3><p>FROM registry.access.redhat.com/ubi8/python-38</p><p>RUN pip install --upgrade --no-cache-dir jupyterlab</p><p>EXPOSE 8888<br>CMD [ &quot;jupyter&quot;, &quot;lab&quot;, &quot;--ip=0.0.0.0&quot; ]</p><h3>Build the container image</h3><p>podman build --layers=false -t cmihai/jupyterlab .</p><h3>Test the image</h3><p>podman run --name jupyterlab --detach --publish 8888:8888/tcp cmihai/jupyterlab<br>podman 
logs jupyterlab # Retrieve the token to log in</p><h3>Building Containers with Buildah</h3><h3>Building container images with buildah</h3><p>container=$(buildah from fedora)<br>buildah run ${container} dnf install -y texlive<br>wget https://github.com/jgm/pandoc/releases/download/2.9.2/pandoc-2.9.2-linux-amd64.tar.gz<br>tar zxf pandoc-2.9.2-linux-amd64.tar.gz<br>buildah copy ${container} pandoc-2.9.2/bin /usr/local/bin<br>buildah commit ${container} cmihai/pandoc<br>buildah images<br>buildah inspect ${container}<br>podman run cmihai/pandoc pandoc --version</p><h3>Pushing images to a container registry with Quay</h3><h3>Logging into Quay</h3><h3>After creating a quay.io account and password, log in using podman</h3><p>podman login quay.io</p><p>Username: cmihai<br>Password: (password here)</p><h3>Creating a new container</h3><h3>Create a new container image based on UBI8 Minimal</h3><p>podman run --name ubi8-httpd \<br>-it registry.access.redhat.com/ubi8/ubi-minimal:latest \<br>/bin/bash</p><p>microdnf update -y<br>microdnf -y install httpd<br>microdnf clean all<br>rm -rf /var/cache/yum</p><h3>Commit the container image</h3><h3>Get the container ID (or name) and commit the changes to an image</h3><p>podman ps -l<br>podman commit ubi8-httpd quay.io/cmihai/ubi8-httpd:latest</p><h3>Check that the image is there</h3><p>podman images | grep cmihai/ubi8-httpd</p><p>quay.io/cmihai/ubi8-httpd latest 8535a6affc3e 15 seconds ago 209 MB</p><h3>Building the same image using a Dockerfile</h3><p>FROM registry.access.redhat.com/ubi8/ubi-minimal<br>USER root<br>LABEL maintainer=&quot;Mihai Criveti&quot;</p><p># Update image<br>RUN microdnf update -y &amp;&amp; microdnf install -y httpd \<br>&amp;&amp; microdnf clean all &amp;&amp; rm -rf /var/cache/yum \<br>&amp;&amp; echo &quot;Apache&quot; &gt; /var/www/html/index.html</p><p># Port<br>EXPOSE 80</p><p># Start the service<br>CMD [&quot;-D&quot;, &quot;FOREGROUND&quot;]<br>ENTRYPOINT [&quot;/usr/sbin/httpd&quot;]</p><h3>Build and tag the image</h3><p>podman build . 
-t cmihai/ubi8-httpd:latest<br>podman tag localhost/cmihai/ubi8-httpd quay.io/cmihai/ubi8-httpd:latest</p><h3>Test the image</h3><h3>Run the image</h3><p>podman run --detach --name httpd --publish 8080:80 quay.io/cmihai/ubi8-httpd:latest</p><h3>Check logs and server status</h3><p>podman logs httpd<br>podman port -l<br>curl localhost:8080</p><h3>Push the image to quay</h3><h3>Push the local tagged image to quay.io</h3><p>podman push quay.io/cmihai/ubi8-httpd</p><h3>Check the image on quay.io</h3><h3>Check that the image is there</h3><p>podman pull quay.io/cmihai/ubi8-httpd</p><h3>Visit the image page:</h3><p>See: <a href="https://quay.io/repository/cmihai/ubi8-httpd">https://quay.io/repository/cmihai/ubi8-httpd</a></p><h3>Customize the image information</h3><ul><li>Create a Description</li></ul><h3>Creating a build directly on quay.io</h3><ol><li>Go to <a href="https://quay.io/repository/cmihai/ubi8-httpd?tab=builds">https://quay.io/repository/cmihai/ubi8-httpd?tab=builds</a></li><li>Upload your build Dockerfile</li></ol>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Geospatial Engineering with Podman, PostGIS and QGIS]]></title>
            <link>https://medium.com/@crivetimihai/geospacial-engineering-with-podman-postgis-and-qgis-38d59c278c27?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/38d59c278c27</guid>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Fri, 04 Aug 2023 16:34:54 GMT</pubDate>
            <atom:updated>2023-08-04T16:34:54.912Z</atom:updated>
            <content:encoded><![CDATA[<h3>Docker PostGIS and PGAdmin</h3><p>In this article, I will show you how to:</p><ol><li>Create a PostGIS Docker image FROM postgres and publish it to <a href="https://hub.docker.com/">hub.docker.com</a>.</li><li>Create a Geospatial Database environment in Docker Compose with PostGIS and PGAdmin4.</li><li>Use PGAdmin4 to create a database and enable Geospatial extensions.</li></ol><h3>1. Build a PostGIS Docker Image</h3><ul><li>A Dockerfile contains instructions to build an image.</li><li>Start by extending an existing PostgreSQL 10 Debian image: FROM postgres:10.</li><li>Use apt-get to install required PostGIS extensions.</li><li>Use docker build to create the image, then push it to Docker Hub.</li></ul><h3>Creating the Dockerfile</h3><pre># Extend existing PostgreSQL 10 Debian image: https://hub.docker.com/_/postgres/<br>FROM postgres:10</pre><pre>LABEL maintainer=&quot;Mihai Criveti&quot;</pre><pre># Install PostGIS packages<br>RUN apt-get update &amp;&amp; apt-get install --no-install-recommends --yes \<br>    postgresql-10-postgis-2.4 postgresql-10-postgis-2.4-scripts postgresql-contrib</pre><h3>Building the image</h3><ul><li>Turn the Dockerfile into a usable image using docker build.</li><li>Tag the image with a namespace (the one used on Docker Hub): cmihai</li></ul><pre>docker build --tag cmihai/postgis .</pre><h3>Uploading the image to Docker Hub</h3><ol><li>Push the Dockerfile, README and docker-compose.yaml examples to GitHub</li><li>Test the image end to end</li><li>Push the image to Docker Hub</li></ol><pre>export DOCKER_ID_USER=&quot;username&quot;<br>docker login<br>docker tag cmihai/postgis $DOCKER_ID_USER/my_image<br>docker push $DOCKER_ID_USER/my_image</pre><h3>2. 
Composing multiple images with docker compose</h3><ul><li>PGAdmin4 is a web-based PostgreSQL Administration and SQL Development environment.</li><li>Docker Compose can link an existing <a href="https://hub.docker.com/r/dpage/pgadmin4/">dpage/pgadmin4</a> image from Docker Hub to cmihai/postgis</li><li>Log in to <a href="http://localhost:5050/">http://localhost:5050</a> as admin:admin after running docker-compose up</li></ul><h3>Create docker-compose.yaml</h3><pre>version: &#39;3.1&#39;<br>services:</pre><pre>    postgis:<br>        image: cmihai/postgis<br>        container_name: postgis<br>        ports:<br>            - &#39;5432:5432&#39;<br>        environment:<br>            POSTGRES_PASSWORD: postgres<br>        volumes:<br>            - pgdata:/var/lib/postgresql/data</pre><pre>    pgadmin4:<br>        image: dpage/pgadmin4<br>        container_name: pgadmin4<br>        ports:<br>            - &#39;5050:80&#39;<br>        environment:<br>            PGADMIN_DEFAULT_EMAIL: admin<br>            PGADMIN_DEFAULT_PASSWORD: admin<br>        links:<br>            - postgis</pre><pre>volumes:<br>  pgdata:</pre><h3>Starting the services</h3><p>Use docker-compose up to start the services:</p><pre>docker-compose up</pre><h3>3. 
Create a database and enable PostGIS with PGAdmin4</h3><ol><li>Log in to pgadmin4: <a href="http://localhost:5050/">http://localhost:5050</a> with admin:admin</li><li>Add a connection to postgis with user/pass postgres:postgres</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1016/0*zPfomM4UsbphvGEU.png" /></figure><ol start="3"><li>Create a new database and call it gis</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Oe7UM8SCjo5kYjFF.png" /></figure><ol start="4"><li>Open the SQL Query Tool on the newly created gis database: In the Browser window, select <em>Servers &gt; postgis &gt; Databases &gt; gis</em>, then run <em>Tools &gt; Query Tool</em> from the <em>Menu</em>.</li><li>Run the following SQL code to enable postgis database extensions:</li></ol><pre>-- Enable PostGIS (includes raster)<br>CREATE EXTENSION postgis;</pre><pre>-- Enable Topology<br>CREATE EXTENSION postgis_topology;</pre><pre>-- Enable PostGIS Advanced 3D and other geoprocessing algorithms<br>CREATE EXTENSION postgis_sfcgal;</pre><pre>-- Fuzzy matching needed for Tiger<br>CREATE EXTENSION fuzzystrmatch;</pre><pre>-- Rule based standardizer<br>CREATE EXTENSION address_standardizer;</pre><pre>-- Example rule data set<br>CREATE EXTENSION address_standardizer_data_us;</pre><pre>-- Enable US Tiger Geocoder<br>CREATE EXTENSION postgis_tiger_geocoder;</pre><h3>Expected Outcome: gis database with geospatial extensions</h3><ul><li>Query returned successfully:</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*4hUDWMusVTidw7Cn.png" /></figure><ul><li><em>gis &gt; Extensions</em> now lists a number of GIS extensions: postgis, postgis_sfcgal, postgis_tiger_geocoder, postgis_topology, fuzzystrmatch, address_standardizer and address_standardizer_data_us.</li><li><em>gis &gt; Schemas &gt; public &gt; Functions</em> has been populated with a high number (1000+) of GIS specific functions.</li><li>A new table called spatial_ref_sys is now available under 
<em>gis &gt; Schemas &gt; public &gt; Tables</em>.</li><li>New schemas: tiger, tiger_data and topology have been created.</li></ul><h3>Next Steps:</h3><ul><li>Load Geospatial data from shapefile, KML, GeoJSON, etc.</li><li>Connect GIS Desktop clients such as QGIS.</li><li>Connect to PostGIS using Python (ex: geopandas).</li><li>Perform geospatial queries and analysis on the data.</li></ul><h3>Links and References:</h3><ul><li>GitHub Repository with Dockerfile and docker-compose.yaml: <a href="https://github.com/crivetimihai/geospacial-engineering">https://github.com/crivetimihai/geospacial-engineering</a></li><li>Docker Image: <a href="https://hub.docker.com/">https://hub.docker.com</a></li></ul><h3>A look at Open Data</h3><h3>Common GIS data formats:</h3><p>QGIS can be used to import a variety of formats into PostGIS, including Shapefiles, KML and GeoJSON.</p><ul><li>Shapefile: a popular geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a (mostly) open specification for data interoperability among Esri and other GIS software products.</li><li>KML: Keyhole Markup Language is an XML notation for expressing geographic annotation and visualization within Internet-based, two-dimensional maps and three-dimensional Earth browsers.</li><li>GeoJSON: a format for encoding a variety of geographic data structures.</li></ul><h3>Open Data Sources</h3><ul><li>Ireland’s Open Data Portal: <a href="https://data.gov.ie/">https://data.gov.ie/</a></li><li>Irish Spatial Data Exchange: <a href="http://isde.ie/">http://isde.ie</a></li><li>Census 2016 Open Data: <a href="http://census2016.geohive.ie/">http://census2016.geohive.ie/</a></li><li>European Data Portal: <a href="https://www.europeandataportal.eu/">https://www.europeandataportal.eu</a></li><li>2600+ Open Data sources: <a 
href="https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/">https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/</a></li></ul><h3>Interesting datasets:</h3><ol><li><a href="https://www.gsi.ie/en-ie/data-and-maps/Pages/Geohazards.aspx">Landslide / Geohazards data, from GSI</a></li><li><a href="https://data.gov.ie/dataset/railway-stations-osi-national-250k-map-of-ireland">Railway stations</a></li><li><a href="https://data.gov.ie/dataset/rail-network-osi-national-250k-map-of-ireland">Railway lines</a></li><li><a href="https://data.gov.ie/dataset/road-rail-intersections-osi-national-250k-map-of-ireland">Railway road intersections</a></li><li><a href="https://data.gov.ie/dataset/built-up-areas-osi-national-250k-map-of-ireland">Built up areas</a></li><li><a href="https://data.gov.ie/dataset/settlements-ungeneralised-osi-national-statistical-boundaries">Settlements</a></li><li><a href="https://data.gov.ie/dataset/centres-of-population-osi-national-placenames-gazetteer">Population</a></li><li><a href="https://data.gov.ie/dataset/gsi-groundwater-vulnerability">Groundwater vulnerability</a></li><li><a href="https://data.gov.ie/dataset/development-plans-dublin-city">Dublin development plans</a></li></ol><h3>Downloading data:</h3><pre># Download the data:<br>wget http://spatial.dcenr.gov.ie/GSI_DOWNLOAD/Landslide_Susceptibility_Map_Ireland.zip</pre><pre># Unzip the data:<br>unzip Landslide_Susceptibility_Map_Ireland.zip</pre><h3>Loading data with QGIS</h3><p>In this article, I will show you how to:</p><ol><li>Install QGIS on Ubuntu Linux.</li><li>Connect QGIS to PostgreSQL/PostGIS.</li><li>Import data (shapefiles, GeoJSON) into the GIS database using DBManager.</li></ol><h3>1. 
Install QGIS:</h3><p>On Ubuntu Linux, you can use:</p><pre>sudo apt-get update<br>sudo apt-get install qgis</pre><p>For other operating systems, follow the instructions listed at <a href="https://qgis.org/en/site/">https://qgis.org/en/site/</a></p><h3>Connect to PostGIS</h3><ul><li>Add PostGIS in QGIS</li><li>Under <em>Browser</em>, right click <em>PostGIS &gt; New Connection</em> and set Name: postgis, Host: localhost, Port: 5432.</li><li>Save the connection details.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*mkW4MKlsmka53J3u.png" /></figure><h3>2. Import Landslide shapefile data into QGIS, then Postgres</h3><h3>Import the data in QGIS</h3><blockquote><em>First, we import the sample data into QGIS:</em></blockquote><ol><li>Open <em>Layer &gt; Data Source Manager &gt; Home</em> and find the layers</li><li>Select the layers you wish to add and click the Add Selected Layers button.</li></ol><h3>Export the data to PostgreSQL / PostGIS</h3><ol><li>Click on <em>Database &gt; DB Manager &gt; DB Manager</em></li><li>Select <em>PostGIS &gt; yourdb &gt; your schema</em>, then <em>Table &gt; Import Layer / File</em> and name it (ex: Landslide_Events)</li><li>Repeat step 2 for every layer you wish to import</li><li>Close the DB Manager</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*GCyaGyFOr_aX6Hki.png" /></figure><blockquote><em>Note: The import activity can take a long time. 
You can monitor progress in the PGAdmin4 Dashboard by looking at the Tuples In: Inserts graph:</em></blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*GPM6La_bo33N_sVl.png" /></figure><h3>Delete the layers and load them from the DB</h3><ol><li>In Layers, right click each layer and choose <em>Remove</em></li><li>In <em>Browser &gt; PostGIS &gt; postgis &gt; public</em>, double click each layer (in the right order).</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*DxcpRchyaKdyvm8o.png" /></figure><h3>Using the psql command-line client and other tools</h3><h3>List the networks. There should be a container running on the network postgis_default</h3><pre>docker network ls</pre><blockquote><em>2e2afc387fbb postgis_default bridge local</em></blockquote><h3>Connect to PostGIS using the psql client (from Docker):</h3><pre>docker run --net postgis_default -it --rm \<br>    --link postgis:postgis postgres psql -h postgis -U postgres -d gis</pre><h3>Using a local client:</h3><pre>psql -h localhost -U postgres -d gis</pre><h3>List Tables:</h3><pre>\dt</pre><h3>Run SQL Queries:</h3><pre>-- Events created by ABC, in descending order of creation date:<br>SELECT county, quaternary, slope_type, bedrock_ty, land_use_c, creationda, creator<br>    FROM public.&quot;Landslide_Events&quot;<br>    WHERE creator = &#39;ABC&#39; ORDER BY creationda DESC;</pre>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Guide to getting started with Docker, Python and Jupyter Notebook on zLinux.]]></title>
            <link>https://medium.com/@crivetimihai/guide-to-getting-started-with-docker-python-and-jupyter-notebook-on-zlinux-369da9d025ed?source=rss-7648462e917d------2</link>
            <guid isPermaLink="false">https://medium.com/p/369da9d025ed</guid>
            <dc:creator><![CDATA[Mihai Criveti]]></dc:creator>
            <pubDate>Fri, 04 Aug 2023 16:31:30 GMT</pubDate>
            <atom:updated>2023-08-04T16:31:30.854Z</atom:updated>
            <content:encoded><![CDATA[<p>Here, I’m using Red Hat Enterprise Linux 7.5 to build and deploy Jupyter Notebook in an Ubuntu container. I will go over the steps used to build and run a Docker container.</p><p>Oh, and in case you’re wondering why anyone would do this, check out this snippet from the <a href="https://developer.ibm.com/mainframe/2017/07/17/ibm-z-software-z14-announcement/">z14 announcement:</a> “Microservices can be built on z14 with Node.js, Java, Go, Swift, Python, Scala, Groovy, Kotlin, Ruby, COBOL, PL/I, and more. They can be deployed in Docker containers where a single z14 can scale out to 2 million Docker containers”.</p><p>A few basic commands:</p><p>Establish the OS release and version. We’re running on RHEL 7.5 for s390x.</p><pre>[cmihai@rh74s390x ~]$ cat /etc/redhat-release<br>Red Hat Enterprise Linux Server release 7.5 (Maipo)</pre><pre>[cmihai@rh74s390x ~]$ uname -a<br>Linux rh74s390x.novalocal 3.10.0-693.17.1.el7.s390x #1 SMP Sun Jan 14 10:38:29 EST 2018 s390x s390x s390x GNU/Linux</pre><pre>[cmihai@rh74s390x ~]$ docker --version<br>Docker version 17.05.0-ce, build 89658be</pre><h3>SET UP REGULAR USER ACCESS, SUDO AND SSH KEYS</h3><h3>CREATE A REGULAR USER ACCOUNT</h3><pre>useradd cmihai<br>passwd cmihai<br>usermod -aG wheel cmihai<br>su - cmihai</pre><h3>ADD YOUR SSH PUBLIC KEY TO AUTHORIZED_KEYS</h3><pre>mkdir -p ~/.ssh<br>echo &quot;YOURKEYHERE&quot; &gt;&gt; ~/.ssh/authorized_keys</pre><h3>LOG IN AS YOUR NEW USER, AND FORWARD PORT 9000:</h3><pre>ssh -L 9000:127.0.0.1:9000 -i cmihai.pem cmihai@myzLinux</pre><h3>SET UP DOCKER</h3><h3>CREATE THE DOCKER GROUP</h3><pre>sudo groupadd docker<br>sudo usermod -aG docker cmihai</pre><p>Log out and back in so the new group membership takes effect.</p><h3>START DOCKER</h3><pre>sudo systemctl enable docker<br>sudo systemctl restart docker.service<br>sudo systemctl status docker.service</pre><h3>TEST DOCKER</h3><pre>docker run s390x/hello-world</pre><h3>LET’S RUN A SIMPLE UBUNTU INTERACTIVE 
SHELL:</h3><pre>docker run --name s390x-ubuntu --hostname s390x-ubuntu --interactive --tty s390x/ubuntu /bin/bash</pre><h3>BUILDING A DOCKER CONTAINER FOR JUPYTER NOTEBOOK</h3><p>Create a Dockerfile from the s390x/ubuntu base image.</p><pre>FROM s390x/ubuntu<br>LABEL maintainer=&quot;Mihai Criveti&quot;</pre><pre># INSTALL PYTHON 3, PIP AND JUPYTER<br>RUN apt-get update \<br>    &amp;&amp; apt-get install -y python3 python3-pip \<br>    &amp;&amp; pip3 install jupyter \<br>    &amp;&amp; apt-get clean</pre><pre># COMMAND and ENTRYPOINT:<br>CMD [&quot;jupyter&quot;,&quot;notebook&quot;,&quot;--allow-root&quot;,&quot;--ip=0.0.0.0&quot;,&quot;--port=9000&quot;]</pre><pre># NETWORK<br>EXPOSE 9000</pre><h3>BUILD THE CONTAINER:</h3><pre>docker build . --tag &quot;cmihai/jupyter-lite:v1&quot; -f Dockerfile</pre><h3>RUN YOUR NEW CONTAINER:</h3><pre>docker run --name jupyter --hostname jupyter -p 9000:9000 cmihai/jupyter-lite:v1</pre><h3>CONNECT TO JUPYTER NOTEBOOK</h3><p><a href="http://localhost:9000/">http://localhost:9000</a></p><h3>YOU CAN NOW INSTALL DEPENDENCIES DIRECTLY FROM JUPYTER:</h3><pre>!apt-get install --yes zlib1g-dev libjpeg-dev</pre><p>Potential next steps:</p><ul><li>Consider setting up persistence for your notebooks (ex: VOLUME [&quot;/notebooks&quot;] in the Dockerfile)</li><li>Set up Docker Compose and build multi-tier application specifications, such as connecting your Jupyter Notebook to PostgreSQL, Redis, Spark, etc.</li><li>Set up other programming languages or kernels (Java, R), or even Zeppelin Notebook</li></ul><p>For an interactive tutorial on using Docker for Data Science, check out: <a href="https://github.com/crivetimihai/docker-data-science">https://github.com/crivetimihai/docker-data-science</a></p><p>To see the original article, check out <a href="https://www.linkedin.com/pulse/data-science-environment-docker-jupyter-ibm-mainframe-mihai-criveti/">https://www.linkedin.com/pulse/data-science-environment-docker-jupyter-ibm-mainframe-mihai-criveti/</a></p><img 
src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=369da9d025ed" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>