Scaling Code Quality: Empowering Evermos Data Team with SQLFluff

Muhammad Ariq Naufal
evermos-tech
Published in
6 min readMar 19, 2024

--

by Muhammad Ariq Naufal and Rendy B. Junior

“Maintaining good code quality is a shared responsibility.”

Having good code quality can improve the readability and consistency of the code that the company has determined, making it easier for us as developers to understand the contents of the code.

The Evermos data team already has an SQL convention that aims to regulate how code is written at Evermos. However, with the increasing size of the team and the increasing frequency of new model creation every day, relying solely on manual reviews will increase several issues, namely:

  • Disrupted team productivity: Conducting code reviews requires attention to detail, so with the data team growing and more frequent model creation, manual reviews become a time bottleneck, hindering developers’ ability to focus on core tasks like data analysis and model development.
  • Increased human error: Manual reviews, while valuable, are inherently prone to human oversight and inconsistency. Inconsistencies in interpretation and potential fatigue could lead to missed errors or inconsistencies slipping through the cracks, compromising code quality and potentially impacting downstream analysis and models.
  • Limited scalability: As the data volume and complexity grow, manual reviews become increasingly unsustainable. The team needs a solution that can scale efficiently without sacrificing code quality or requiring an unreasonable time commitment from team members.

Given the concerns mentioned above, the Evermos data team searched for the best SQL linter on the internet and decided to use SQLFluff. SQLFluff can automatically check and standardize code, allowing the data team to focus on tasks that have a greater impact on the company.

SQLFluff is an open source SQL linter and formatter that helps improve the quality of SQL code by enforcing best practices, readability, and consistency.

With implementing SQLFluff into our data modeling workflow, we hoped to achieve a balance between automation and human expertise, allowing our team to remain productive, efficient, and deliver data insights with confidence.

Choosing the right SQL linter

Before choosing SQLFluff, we explored several solutions to address the bottlenecks of manual code reviews. Here were our top 3 options:

Ultimately, SQLFluff offered the best balance of automation, scalability, and flexibility to meet our specific needs. It addressed the core issues of manual reviews while providing enough customization options to fit our established conventions. However, the best choice will depend on your unique team setup, codebase size, and specific coding standards.

Implementing SQLFluff to our workflow

Before SQLFluff, our model creation workflow relied primarily on code reviews by peers or leads within each team. However, with the implementation of SQLFluff, the data team can now focus on reviewing the logic of the code without worrying about the code quality, which is now checked by SQLFluff.

PR Workflow on Evermos

SQLFluff has a steep learning curve, so before we directly implement the tools, there are few things to be prepared for.

Preparation:

  1. Customizing SQLFluff Rules: The team carefully tailored the existing SQLFluff rules to align with our SQL conventions. This involved leveraging the core rules offered by SQLFluff while removing irrelevant ones (like LT12 and LT05). Additionally, we implemented new rules (like ST07 and LT04) to address specific formatting and style preferences within our team.
  2. Defining Code Review Scope: A practical approach was taken by focusing code reviews solely on changes and newly added files, rather than the entire codebase. This decision considered the time constraints associated with a comprehensive review.
  3. GitHub Action Integration: For seamless integration with the existing workflow, SQLFluff was incorporated into GitHub Actions. This eliminated the need for context switching during code reviews and ensured efficient processing. GitHub Actions automatically execute SQLFluff checks on every pull request submitted.
  4. Documentation and Training: Recognizing the importance of user adoption, we invested in creating detailed documentation and conducting a dedicated workshop. This initiative effectively utilizes the tool while addressing any potential issues that may arise.

Implementation:

  1. Checking SQLFluff Results: After creating a pull request, team members must check the results of the SQLFluff GitHub Action. This ensures that all code changes adhere to the established SQL conventions.
  2. Making Necessary Changes: If any errors are detected by SQLFluff, the team member must make the necessary changes to the code. This may involve fixing formatting issues, addressing syntax errors, or modifying code to comply with specific rules.
  3. Pushing the Pull Request: Once all errors have been addressed and the code passes the SQLFluff checks, the pull request can be pushed for further review and merging.

By following these steps, our team has successfully implemented SQLFluff, streamlining code reviews, improving code quality, and ensuring consistent coding practices across the data team.

Challenges and Learnings

While SQLFluff proved to be an effective tool for code review, the initial implementation presented several challenges:

  • Overwhelming Error Volume: During the first implementation, a significant number of errors were flagged, making it difficult to identify and address changes within pull requests.
  • Macro Readability Issues: SQLFluff faced difficulties in reading dbt macros, particularly with incremental macros, leading to false error flags related to indentation.
  • Steep Learning Curve: The initial implementation required significant time and effort to understand and address the large number of errors, creating a sense of overwhelm for the team.

Learnings and Solutions:

To overcome these challenges, the team adopted the following strategies:

  • Splitting Commits: Separating code changes and SQLFluff fixes into separate commits improved readability and facilitated targeted code review.
  • Disabling QA for Macros: Using the — no-qa flag allowed the team to ignore SQLFluff checks for specific macros, such as incremental macros, mitigating false error flags.
  • Gradual Adoption: Implementing SQLFluff gradually allowed the team to familiarize themselves with the tool and address errors iteratively, reducing the initial workload and promoting sustainable adoption.
  • Leveraging Documentation: The team extensively leveraged SQLFluff’s documentation and community resources to understand error messages, troubleshoot issues, and find best practices.
  • Encouraging Collaboration: Team members actively shared knowledge and solutions through regular discussions and code reviews, fostering a culture of continuous learning and improvement.

Impact and Benefits

By implementing SQLFluff, the Evermos data team has successfully addressed the bottlenecks and limitations of manual code reviews. We’ve achieved:

  • Enhanced Productivity: Automating repetitive tasks has freed up valuable developer time for high-impact data analysis and model development.
  • Improved Code Quality: Consistent enforcement of coding standards minimizes human error and ensures a high-quality codebase.
  • Scalable Solution: SQLFluff adapts seamlessly to our growing team and data complexity, allowing us to maintain code quality efficiently.

Few testimonials from our happy data team 😀

Overall, SQLFluff has been a valuable tool for the Evermos data team. It has helped us to improve code quality, increase developer productivity, and enhance code consistency.

Looking Ahead: We’re committed to continuous improvement and innovation within our data team. SQLFluff represents a step towards a more efficient and collaborative data management environment.

If you are facing similar challenges with manual code reviews, I encourage you to explore SQLFluff and other available solutions to find the best fit for your team.

Are you passionate about data and eager to contribute to a dynamic team leveraging cutting-edge tools? Evermos is always looking for talented individuals to join our data team. If you’re excited about tackling real-world challenges and shaping the future of data analysis at Evermos, we encourage you to explore our careers page!

--

--