Welcome to Behind the Mutex! Our weekly newsletter summarizes notable new activity in open source, suggests pro tips and highlights interesting tweets.
In Case You Missed It
This section covers Behind the Mutex posts of the past week.
Open-Source Landscape
Behind the Mutex picks a few categories and explores new projects and features in each.
DataOps
SQLMesh: https://github.com/TobikoData/sqlmesh
SQLMesh is a new solution to a whole class of data engineering problems. Borrowing well-known DevOps practices and tooling, it aims to streamline and optimize the maintenance of data transformation pipelines and lineage. Similar to Terraform, the product lets users plan changes to their models, test them and apply them to their production environment.
AWS Redshift, GCP BigQuery, Snowflake and Spark are supported. A GitHub Action for your CI/CD pipelines is also available.
The SQLMesh team has just added a Web-based IDE where users can manage their models and plan and apply changes to them via a new, user-friendly UI.
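To make the Terraform analogy concrete, here is a hedged sketch of a model defined in Python, assuming SQLMesh’s documented @model decorator and ExecutionContext; the model name, schedule and columns are illustrative.

```python
# A minimal sketch of a SQLMesh Python model, assuming the project's documented
# `@model` decorator and `ExecutionContext`; names and columns are illustrative.
import typing as t
from datetime import datetime

import pandas as pd
from sqlmesh import ExecutionContext, model


@model(
    "analytics.daily_signups",  # hypothetical model name
    cron="@daily",
    columns={"signup_date": "date", "signups": "int"},
)
def execute(
    context: ExecutionContext,
    start: datetime,
    end: datetime,
    execution_time: datetime,
    **kwargs: t.Any,
) -> pd.DataFrame:
    # A real model would query upstream tables via `context`; a static frame
    # keeps the sketch self-contained.
    return pd.DataFrame([{"signup_date": start.date(), "signups": 0}])
```

After adding or editing a model, sqlmesh plan shows the pending changes, much like terraform plan, and lets you apply them to the chosen environment.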
Behind the Mutex will be covering SQLMesh in detail in our upcoming reviews, which are published on the first and third Thursday of every month. Sign up to receive our deep dive into the codebase.
LLMOps
The community around LLM applications is a force of nature right now, both in open source and on Twitter. Many enthusiasts are experimenting with prompts using LangChain and similar projects. Others are building LLM-driven agents, indexing their information sources for semantic search and even automating decision-making processes.
Here are a few noteworthy open-source projects in LLMOps that have had our attention this week:
LangChain: https://github.com/hwchase17/langchain
The LangChain contributors have been doing an amazing job building and releasing new versions of the package almost every day over the past week.
I highly recommend skimming through these new additions to the project’s documentation:
Custom Agent with Tool Retrieval offers a way to scale the list of available tools beyond what fits in your prompts: instead of listing every tool, the agent picks the most relevant ones from a previously prepared tool store (see the sketch after this list).
Natural Language API Toolkits shows how multiple OpenAPI agents can be combined into a single agent that exposes the underlying APIs via natural language.
Other notable additions this week include Improved Redis Memory, Table Indices in SQL Chain, ElasticSearchBM25Retriever and TFIDFRetriever.
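To make the tool-retrieval idea concrete, here is a hedged sketch, assuming the langchain APIs available at the time of writing (Tool, OpenAIEmbeddings, FAISS); the tools themselves are made up for illustration, and an OpenAI API key is required to run it.

```python
# A minimal sketch of tool retrieval, assuming langchain's Tool, FAISS and
# OpenAIEmbeddings APIs as of this writing; the tools below are hypothetical.
from langchain.agents import Tool
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import Document
from langchain.vectorstores import FAISS

# Hypothetical tools; in a real agent these would wrap actual chains or APIs.
tools = [
    Tool(name="weather", func=lambda q: "sunny", description="Useful for questions about the weather"),
    Tool(name="stocks", func=lambda q: "up 2%", description="Useful for questions about stock prices"),
    Tool(name="sports", func=lambda q: "3:1", description="Useful for questions about sports results"),
]

# Index the tool descriptions in a vector store (requires OPENAI_API_KEY).
docs = [
    Document(page_content=t.description, metadata={"index": i})
    for i, t in enumerate(tools)
]
retriever = FAISS.from_documents(docs, OpenAIEmbeddings()).as_retriever()

def get_tools(query: str) -> list[Tool]:
    """Return only the tools whose descriptions are most relevant to the query."""
    relevant = retriever.get_relevant_documents(query)
    return [tools[doc.metadata["index"]] for doc in relevant]

# Only the relevant subset ends up in the agent's prompt instead of the full list.
print([t.name for t in get_tools("will it rain tomorrow?")])
```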
⚠️ WARNING: the pace of contributions to the project is quite high, but the quality bar for incoming PRs to be merged into the codebase appears to be really low. You will not find a single test in the PRs mentioned above, which has all sorts of consequences for using the package in production. Please be cautious and consider the risks.
If you are interested in LangChain, don’t miss our deep-dive review of this project’s codebase and its most critical and fundamental pieces. The review will be published next Thursday, April 20.
Flux: https://github.com/transmissions11/flux
For those focused on prompt engineering and experimentation, Flux might be quite relevant. This is a Web-based visual environment for exploring and debugging LLM prompts. The user can build entire tree-like interactions with their models, simultaneously generate multiple LLM responses, tune the temperature and more. You can find the free publicly available version hosted at https://flux.paradigm.xyz. Alternatively, you can easily run it locally with Node.js.
LangFlow: https://github.com/logspace-ai/langflow
LangFlow is a Web UI for designing and experimenting with LangChain and its various components. The tool looks similar to Flux, but offers higher-level blocks to combine. The user can drag and drop components onto the current Flow, then configure and connect them according to their semantics. LangFlow is distributed as a PyPI package and can be installed right into your LangChain project.
LangChain AI Plugin: https://github.com/langchain-ai/langchain-aiplugin
Inspired by ChatGPT Retrieval Plugin, LangChain AI Plugin offers users a way to interact with their LangChain agents right in their ChatGPT conversations.
DevOps
HeadScale: https://github.com/juanfont/headscale
Those who have been tracking various applications of the WireGuard VPN protocol may find HeadScale an interesting alternative to TailScale for hosting their private peer-to-peer networks. An instance of HeadScale acts as a self-hosted TailScale control server and can manage a single network. If you need to build a mesh network of machines within an organization and don’t require a GUI, HeadScale is a viable option.
Localstack: https://github.com/localstack/localstack
When it comes to integration testing, engineers often have to mock access to third-party services in order to properly cover their functionality, and there are plenty of tools that can help with that. In the context of AWS, there is always the option to mock each service separately; for example, Minio can be used as a drop-in replacement for S3. Localstack instead offers a single container image that can serve a whole set of AWS APIs.
Its most recent release brings revamped and more performant S3 and Lambda APIs.
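To illustrate, here is a hedged sketch of pointing boto3 at a running Localstack container, assuming the default edge endpoint on port 4566 and dummy credentials; the bucket and key names are made up.

```python
# A minimal sketch of running S3 calls against Localstack instead of AWS.
# Assumes Localstack is listening on its default edge endpoint, localhost:4566.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4566",  # assumption: default Localstack port
    aws_access_key_id="test",              # dummy credentials are sufficient locally
    aws_secret_access_key="test",
    region_name="us-east-1",
)

# The same code that would hit real S3 now runs entirely against the local emulator.
s3.create_bucket(Bucket="integration-test-bucket")
s3.put_object(Bucket="integration-test-bucket", Key="fixture.json", Body=b"{}")
print(s3.list_objects_v2(Bucket="integration-test-bucket")["KeyCount"])
```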
Pulumi: https://github.com/pulumi/pulumi
If you are tired of describing your infrastructure in a DSL such as HCL and working around its limitations, Pulumi might be worth checking out. It provides the means to manage your resources in the imperative language of your choice; you can even describe your stack in multiple languages. The number of supported cloud and SaaS providers is on par with other strong players in this area.
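As a taste of that imperative approach, here is a hedged sketch of a Pulumi program in Python, assuming the pulumi and pulumi-aws packages; the resource is purely illustrative.

```python
# A minimal sketch of a Pulumi program in Python; assumes the pulumi and
# pulumi-aws packages, and the bucket name is illustrative.
import pulumi
import pulumi_aws as aws

# Declare an S3 bucket as a regular Python object; `pulumi up` previews and
# applies the resulting change, much like `terraform plan` and `apply`.
bucket = aws.s3.Bucket("newsletter-assets")

# Export the generated bucket name so other stacks or tools can consume it.
pulumi.export("bucket_name", bucket.id)
```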
If you have any feedback or would like to see certain open-source projects highlighted in our upcoming summaries and reviews, please feel free to comment, send an email or DM the author on Twitter @dalazx.
Until next Tuesday,
Behind the Mutex.