# Semi-join Subquery Optimizations

MariaDB has a set of optimizations specifically targeted at *semi-join subqueries*.

## What is a Semi-Join Subquery

A semi-join subquery has a form of

```sql
SELECT ... FROM outer_tables WHERE expr IN (SELECT ... FROM inner_tables ...) AND ...
```

that is, the subquery is an IN-subquery and it is located in the WHERE clause. The most important part here is

*with semi-join subquery, we're only interested in records of outer\_tables that have matches in the subquery*

Let's see why this is important. Consider a semi-join subquery:

```sql
SELECT * FROM Country 
WHERE 
  Continent='Europe' AND 
  Country.Code IN (SELECT City.country 
                   FROM City 
                   WHERE City.Population>1*1000*1000);
```

One can execute it "naturally", by starting from countries in Europe and checking if they have populous Cities:

![semi-join-outer-to-inner](/files/h2JWwTqullF0dDc5yTdh)

The semi-join property also allows "backwards" execution: we can start from big cities, and check which countries they are in:

![semi-join-inner-to-outer](/files/wbE9XTUYMLBFMgsm8CNv)

To contrast, let's change the subquery to be non-semi-join:

```sql
SELECT * FROM Country 
WHERE 
   Country.Continent='Europe' AND 
   (Country.Code IN (SELECT City.country 
                   FROM City WHERE City.Population>1*1000*1000) 
    OR Country.SurfaceArea > 100*1000  -- Added this part
   );
```

It is still possible to start from countries, and then check

* if a country has any big cities
* if it has a large surface area:

![non-semi-join-subquery](/files/zMLzSWjAaFhEt31o29Gt)

The opposite, city-to-country way is not possible. This is not a semi-join.

### Difference from Inner Joins

Semi-join operations are similar to regular relational joins. There is a difference though: with semi-joins, you don't care how many matches an inner table has for an outer row. In the above countries-with-big-cities example, Germany will be returned once, even if it has three cities with populations of more than one million each.

## Semi-Join Optimizations in MariaDB

MariaDB uses semi-join optimizations to run IN subqueries.The optimizations are enabled by default. You can disable them by turning off their [optimizer\_switch](/docs/server/server-management/variables-and-modes/server-system-variables.md#optimizer_switch) like so:

```sql
SET optimizer_switch='semijoin=off'
```

MariaDB has five different semi-join execution strategies:

* [Table pullout optimization](/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/subquery-optimizations/table-pullout-optimization.md)
* [FirstMatch execution strategy](/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/optimization-strategies/firstmatch-strategy.md)
* [Semi-join Materialization execution strategy](/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/optimization-strategies/semi-join-materialization-strategy.md)
* [LooseScan execution strategy](/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/optimization-strategies/loosescan-strategy.md)
* [DuplicateWeedout execution strategy](/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/optimization-strategies/duplicateweedout-strategy.md)

## See Also

* [Subquery Optimizations Map](/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/subquery-optimizations/subquery-optimizations-map.md)
* ["Observations about subquery use cases"](https://s.petrunia.net/blog/?p=35) blog post
* [http:en.wikipedia.org/wiki/Semijoin](https://en.wikipedia.org/wiki/Semijoin)

<sub>*This page is licensed: CC BY-SA / Gnu FDL*</sub>

{% @marketo/form formId="4316" %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/subquery-optimizations/semi-join-subquery-optimizations.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
