This article explains every merge join type in Power Query step by step so you can choose the correct option for each scenario, avoid unexpected duplicates or missing rows, and build reliable data models directly from Excel or Power BI.
1. What the Merge operation does in Power Query
In Power Query, a merge combines two queries (tables) based on one or more key columns, similar to relational database joins or Excel lookup formulas.
Conceptually, a merge answers the question: “How do rows from Table A relate to rows from Table B when they share the same key value?” The result depends entirely on the join type you choose.
1.1 Typical use cases for Merge
- Replacing VLOOKUP or XLOOKUP when preparing data before loading to Excel worksheets.
- Joining transactional data (e.g., Sales) with reference tables (e.g., Products, Customers).
- Comparing two versions of the same dataset (e.g., previous vs current month) to find new, changed, or removed rows.
- Combining exports from different systems that share a common identifier (e.g., Employee ID, SKU, Invoice Number).
1.2 Basic workflow of a Merge in Power Query
- Load both tables into Power Query.
- Select one table as the primary (the one you have active when you click Merge).
- Choose the second table to merge with.
- Select matching key column(s) in both tables.
- Select the join type (Left Outer, Inner, Full Outer, Left Anti, etc.).
- Expand the resulting nested table column to bring in the fields you need.
Note : The join type determines which rows are kept or removed before you expand columns, so choosing the wrong join type can silently drop data or introduce unexpected extra rows.
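The workflow above corresponds to two M steps behind the scenes: a nested join followed by an expand. The following is a minimal sketch, assuming two hypothetical queries named Sales and Products that share a ProductID column. The table contents and column names are illustrative only.

```powerquery
let
    // Hypothetical sample data standing in for two loaded queries
    Sales = #table(
        type table [OrderID = Int64.Type, ProductID = text, Quantity = Int64.Type],
        {{1, "P-01", 3}, {2, "P-02", 1}, {3, "P-99", 5}}
    ),
    Products = #table(
        type table [ProductID = text, ProductName = text],
        {{"P-01", "Widget"}, {"P-02", "Gadget"}}
    ),
    // Merge on the key column with the chosen join type;
    // matching Products rows land in a nested table column named "Products"
    Merged = Table.NestedJoin(
        Sales, {"ProductID"},
        Products, {"ProductID"},
        "Products", JoinKind.LeftOuter
    ),
    // Expand only the fields you need from the nested table column
    Expanded = Table.ExpandTableColumn(Merged, "Products", {"ProductName"})
in
    Expanded
```

Order 3 has no match in Products, so with a Left Outer join its ProductName is null rather than the row being dropped.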
2. Overview of Power Query join types
When you open the Merge dialog in Power Query, you see several join type options. The names vary slightly by language, but conceptually they match classic SQL joins.
| Join type (UI label) | Technical description | Typical use case |
|---|---|---|
| Left Outer (all from first, matching from second) | Keep all rows from the first query, bring in matching rows from the second. | Standard lookup: master table on the left, lookup table on the right. |
| Right Outer (all from second, matching from first) | Keep all rows from the second query, bring in matching rows from the first. | Same logic as left outer, but with “primary” table logically on the right side. |
| Full Outer (all rows from both) | Keep all rows from both tables, whether matched or not. | Comparing datasets, reconciliation, overlap analysis. |
| Inner (only matching rows) | Keep only rows where the key exists in both tables. | Intersection of two lists, filtering to records present in both sources. |
| Left Anti (only in first) | Keep only rows that exist in the first table but not in the second. | Finding new or unmatched items from the primary table. |
| Right Anti (only in second) | Keep only rows that exist in the second table but not in the first. | Finding extra items that appear only in the comparison table. |
3. Visualizing join behavior with a simple example
Assume you have two simple tables.
3.1 Customer table (first query)
| CustomerID | CustomerName |
|---|---|
| 101 | Alpha Ltd |
| 102 | Beta Inc |
| 103 | Gamma Co |
3.2 Sales table (second query)
| CustomerID | SalesAmount |
|---|---|
| 101 | 1000 |
| 101 | 500 |
| 104 | 200 |
Join key: CustomerID.
3.3 Result shapes by join type
| Join type | Rows returned | Key rows included | Comment |
|---|---|---|---|
| Left Outer | 4 rows | 101 (twice), 102, 103 | All customers kept. Customer 101 duplicated because there are two matching sales rows. |
| Right Outer | 3 rows | 101 (twice), 104 | All sales rows kept, including a sale for 104, even though 104 is not in Customers. |
| Full Outer | 5 rows | 101 (twice), 102, 103, 104 | Union of both lists, preserving all records from both sides. |
| Inner | 2 rows | 101 (twice) | Only customers that have at least one sale. |
| Left Anti | 2 rows | 102, 103 | Customers with no matching sales. |
| Right Anti | 1 row | 104 | Sales for a customer not defined in the customer master. |
Note : If you see more rows after a merge than you had originally in the first table, it usually means there are multiple matching rows on the other side of the join. This is not an error, but a direct consequence of relational join logic.
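To check the row counts yourself, the sketch below rebuilds the Customer and Sales tables from this section and counts the rows each join kind produces after expanding. The helper name CountRows is hypothetical; paste the query into a blank query in the Advanced Editor to try it.

```powerquery
let
    Customers = #table(
        type table [CustomerID = Int64.Type, CustomerName = text],
        {{101, "Alpha Ltd"}, {102, "Beta Inc"}, {103, "Gamma Co"}}
    ),
    Sales = #table(
        type table [CustomerID = Int64.Type, SalesAmount = number],
        {{101, 1000}, {101, 500}, {104, 200}}
    ),
    // Hypothetical helper: merge with the given join kind, expand, count rows
    CountRows = (kind) =>
        let
            Merged = Table.NestedJoin(
                Customers, {"CustomerID"},
                Sales, {"CustomerID"},
                "Sales", kind
            ),
            Expanded = Table.ExpandTableColumn(Merged, "Sales", {"SalesAmount"})
        in
            Table.RowCount(Expanded),
    // One output row per join kind
    Result = #table(
        {"JoinKind", "Rows"},
        {
            {"LeftOuter", CountRows(JoinKind.LeftOuter)},
            {"RightOuter", CountRows(JoinKind.RightOuter)},
            {"FullOuter", CountRows(JoinKind.FullOuter)},
            {"Inner", CountRows(JoinKind.Inner)},
            {"LeftAnti", CountRows(JoinKind.LeftAnti)},
            {"RightAnti", CountRows(JoinKind.RightAnti)}
        }
    )
in
    Result
```

Counting after the expand step matters: before expanding, each first-table row appears once, with its matches held in the nested column.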
4. Left Outer join: the default “lookup-style” merge
The Left Outer join is the most frequently used merge type in Power Query because it mirrors common lookup operations in Excel.
4.1 When to use a Left Outer join
- You have a primary table (e.g., transactions) and a lookup table (e.g., product attributes) and you want all rows from the primary table to remain.
- You want missing matches to appear as null rather than removed rows.
- You are replacing VLOOKUP-type logic with a more robust query-based approach.
4.2 Step-by-step outline (conceptual)
- Select the primary query (e.g., Sales).
- Home > Merge Queries.
- Select the lookup query (e.g., Products).
- Select matching key columns on both sides.
- Select “Left Outer (all from first, matching from second)”.
- Click OK, then expand the new column to bring in product attributes.
4.3 Handling one-to-many relationships
If each key from the primary table matches multiple rows in the lookup table, the resulting merged table will contain duplicates of the primary rows. To prevent this, ensure that the lookup table is unique by key (e.g., group and aggregate, or remove duplicates by key) before performing the merge.
Note : Before using a Left Outer join for lookups, run “Remove Duplicates” on the key column of the lookup table to enforce a one-row-per-key rule where that makes business sense.
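Both ways of enforcing one row per key can be sketched in M. The Prices table and its columns below are hypothetical; which option fits depends on what a repeated key means in your data.

```powerquery
let
    // Hypothetical lookup table in which P-01 appears twice
    Prices = #table(
        type table [ProductID = text, Price = number],
        {{"P-01", 10}, {"P-01", 12}, {"P-02", 7}}
    ),
    // Option A: remove duplicates by key. One row per ProductID survives,
    // but you do not state explicitly which one.
    OnePerKey = Table.Distinct(Prices, {"ProductID"}),
    // Option B: group by key and aggregate with an explicit rule
    // (here: keep the highest price per product)
    MaxPerKey = Table.Group(
        Prices, {"ProductID"},
        {{"Price", each List.Max([Price]), type number}}
    )
in
    MaxPerKey
```

Option B is usually easier to defend in a review, because the rule that picks the surviving value is written down in the query.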
5. Inner join: intersecting two lists
An Inner join returns only the rows that have matching key values in both queries.
5.1 Practical uses of Inner joins
- Finding customers that exist in both CRM and billing systems.
- Filtering transactions to only those that have valid master data entries.
- Extracting a list of IDs that appear in both a “candidate” file and a “final” file.
5.2 Pitfalls with Inner joins
- Any row with a missing match on either side is removed entirely.
- If you expect the same row count as the primary table, an Inner join may surprise you by reducing the row count.
Note : Use an Inner join only when you are intentionally interested in the intersection of the two lists. If you want to retain all rows from the primary table regardless of matches, use a Left Outer join instead.
6. Full Outer join: complete reconciliation
A Full Outer join returns all rows from both tables, matching them where possible. Non-matching rows are kept with nulls on the side where no match exists.
6.1 When a Full Outer join is appropriate
- Comparing two exports of the same logical data (e.g., two systems reporting inventory).
- Reconciling vendor statements against your internal records.
- Auditing data migrations, to ensure no rows are lost or added unexpectedly.
6.2 Typical post-processing steps after a Full Outer join
- Creating status columns such as “Only in A”, “Only in B”, “In Both” using conditional logic.
- Aggregating differences by key to quantify total mismatches.
- Filtering to specific categories (e.g., only rows that exist in one side but not the other).
```
// Example: pseudo-M code pattern for status classification
= Table.AddColumn(
    FullOuterTable,
    "MatchStatus",
    each
        if [Key] <> null and [Key.1] <> null then "In Both"
        else if [Key] <> null then "Only in A"
        else "Only in B"
)
```
7. Anti joins: finding unmatched rows
Anti joins are powerful for identifying differences between datasets.
7.1 Left Anti join
A Left Anti join returns rows that exist in the first table and have no match in the second.
- Examples:
- Customers in your CRM who have no invoices.
- Items in a master list that have never been sold.
- IDs in a control list that are missing from an operational file.
7.2 Right Anti join
A Right Anti join returns rows that exist in the second table and have no match in the first.
- Examples:
- Transactions arriving from a new system that have no corresponding master data entries.
- Codes that appear only in a vendor file but not in your internal reference list.
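Note : Anti joins never bring in matched data from the other table, because by definition there is no matching row to merge. In a Left Anti result the nested column holds only empty tables. In a Right Anti result the first table's columns are null and the unmatched rows sit in the nested column, so expand it to see them. Anti joins are used purely for detecting the presence or absence of keys.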
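A minimal sketch of a Left Anti join, reusing the Customers and Sales tables from section 3. The nested column is removed rather than expanded, since it carries no matching rows.

```powerquery
let
    Customers = #table(
        type table [CustomerID = Int64.Type, CustomerName = text],
        {{101, "Alpha Ltd"}, {102, "Beta Inc"}, {103, "Gamma Co"}}
    ),
    Sales = #table(
        type table [CustomerID = Int64.Type, SalesAmount = number],
        {{101, 1000}, {101, 500}, {104, 200}}
    ),
    // Keep only customers whose CustomerID never appears in Sales
    Merged = Table.NestedJoin(
        Customers, {"CustomerID"},
        Sales, {"CustomerID"},
        "Sales", JoinKind.LeftAnti
    ),
    // The nested column holds only empty tables, so drop it
    CustomersWithoutSales = Table.RemoveColumns(Merged, {"Sales"})
in
    CustomersWithoutSales
```

The result contains customers 102 and 103, the two rows with no sales.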
8. Join types in M code (Table.Join)
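The Merge dialog generates an M expression based on Table.NestedJoin, which stores the matching rows in a nested table column that you then expand. Table.Join is a related function for hand-written code that returns a flat table directly. In both functions, the join type is specified through the JoinKind enumeration.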
```
// Example structure generated by Power Query for a Left Outer join
= Table.NestedJoin(
    FirstTable, {"KeyColumn"},
    SecondTable, {"KeyColumn"},
    "SecondTable",
    JoinKind.LeftOuter
)
```
Common JoinKind values include:
- JoinKind.LeftOuter
- JoinKind.RightOuter
- JoinKind.FullOuter
- JoinKind.Inner
- JoinKind.LeftAnti
- JoinKind.RightAnti
Understanding these constants is useful when you edit M code directly, copy logic between queries, or build functions that perform dynamic joins.
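As one illustration of a dynamic join, the sketch below wraps the merge and expand steps in a function that takes the join kind as a parameter. The function name MergeAndExpand, its parameters, and the temporary column name "__Joined" are all hypothetical choices for this example.

```powerquery
let
    // Hypothetical helper: merge two tables on one shared key column
    // and expand the requested columns from the second table
    MergeAndExpand = (
        left as table,
        right as table,
        key as text,
        columnsToAdd as list,
        optional kind
    ) as table =>
        let
            // Fall back to Left Outer when no join kind is supplied
            joinKind = if kind = null then JoinKind.LeftOuter else kind,
            merged = Table.NestedJoin(left, {key}, right, {key}, "__Joined", joinKind),
            expanded = Table.ExpandTableColumn(merged, "__Joined", columnsToAdd)
        in
            expanded,

    // Sample data to exercise the function
    Customers = #table(
        type table [CustomerID = Int64.Type, CustomerName = text],
        {{101, "Alpha Ltd"}, {102, "Beta Inc"}, {103, "Gamma Co"}}
    ),
    Sales = #table(
        type table [CustomerID = Int64.Type, SalesAmount = number],
        {{101, 1000}, {101, 500}, {104, 200}}
    ),
    // Switch JoinKind.Inner for any other JoinKind value to compare results
    Result = MergeAndExpand(Customers, Sales, "CustomerID", {"SalesAmount"}, JoinKind.Inner)
in
    Result
```

Because the join kind is an ordinary value, you can also drive it from a parameter query and test several join types without rebuilding the merge.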
9. Choosing the right join type: decision framework
The practical way to choose a join type is to map the business question you are asking to the join that answers it.
| Business question | Recommended join type | Reason |
|---|---|---|
| “I want to add descriptive fields (names, categories) to every transaction, and keep all transactions.” | Left Outer | Transactions are the primary table; keep them all regardless of match. |
| “I only want transactions that have valid master records.” | Inner | Remove any row that does not have a matching master record. |
| “I need to see entries that exist in one file but not the other.” | Full Outer, or combination of Left Anti and Right Anti | Full Outer shows all rows; anti joins isolate unpaired rows. |
| “I want to find customers with no purchases.” | Left Anti (Customers vs Sales) | Return customers for which no matching row exists in Sales. |
| “I want only the overlap list between two ID sets.” | Inner | Intersection of both lists. |
Note : It is often safer to start with a Full Outer join when diagnosing data-quality issues, then refine to Left Outer, Inner, or Anti joins once you understand the structure and completeness of both tables.
10. Common pitfalls and how to avoid them
10.1 Unexpected duplicates after merging
Symptom: The row count in the merged result is larger than the row count of the primary table.
Cause: One-to-many or many-to-many relationships between the two tables on the selected key columns, so at least one row in the primary table matches several rows on the other side.
Mitigation steps:
- Check whether the key is truly unique on the lookup side.
- If not unique, consider:
- Aggregating on the lookup table before the merge (e.g., group by key and summarize).
- Redefining the key as a combination of multiple columns to ensure uniqueness.
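The uniqueness check can be done in M before you merge. The sketch below assumes a hypothetical Lookup table with a Key column and returns both a true/false answer and the list of offending keys.

```powerquery
let
    // Hypothetical lookup table in which key "B" appears twice
    Lookup = #table(
        type table [Key = text, Value = number],
        {{"A", 1}, {"B", 2}, {"B", 3}}
    ),
    // true only when no two rows share the same Key
    KeyIsUnique = Table.IsDistinct(Lookup, "Key"),
    // Count rows per key, then keep the keys that occur more than once
    KeyCounts = Table.Group(
        Lookup, {"Key"},
        {{"Rows", each Table.RowCount(_), Int64.Type}}
    ),
    DuplicatedKeys = Table.SelectRows(KeyCounts, each [Rows] > 1)
in
    [KeyIsUnique = KeyIsUnique, DuplicatedKeys = DuplicatedKeys]
```

Inspecting DuplicatedKeys first tells you whether to aggregate, deduplicate, or extend the key with another column.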
10.2 Missing matches due to inconsistent key formats
Symptom: You expect many matches, but most rows in the merged column are null.
Typical causes:
- Different data types on both sides (text vs number).
- Leading/trailing spaces or non-printing characters.
- Case differences: Power Query compares text keys case-sensitively, so "abc" and "ABC" do not match.
Preventive transformations:
- Convert both key columns to the same data type before merging.
- Apply trims and standardization, for example:
```
// Common key cleaning transformations
= Table.TransformColumns(
    Source,
    {
        {"KeyColumn", each Text.Upper(Text.Trim(Text.From(_))), type text}
    }
)
```
Note : Always standardize and clean join keys before merging, especially when data comes from different systems or manual inputs.
10.3 Choosing the wrong primary table
Symptom: After a Left Outer join, many expected rows are missing.
Cause: You selected the table with fewer rows as the first query, even though you intended to keep all rows from the larger dataset.
Solution: The first query in the Merge dialog should be the table whose rows you want to preserve when using Left Outer or Left Anti joins. Reverse the merge order if necessary.
11. Performance considerations for large merges
Merge operations on large datasets can be expensive. Efficient modelling and proper join choices help maintain reasonable refresh times.
11.1 Reduce data before merging
- Filter both tables to the relevant period or subset before the Merge.
- Remove unnecessary columns to shrink memory footprint.
- Aggregate detailed transaction tables when only summaries are needed.
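The three reductions above can be applied in sequence ahead of the merge. This is a sketch under assumed names: the Sales and Products tables, their columns, and the 2024 cutoff date are hypothetical.

```powerquery
let
    // Hypothetical detail table with more rows and columns than the report needs
    Sales = #table(
        type table [OrderDate = date, ProductID = text, Amount = number, Comment = text],
        {
            {#date(2023, 12, 30), "P-01", 100, "old"},
            {#date(2024, 1, 15), "P-01", 250, "new"},
            {#date(2024, 2, 2), "P-02", 80, "new"}
        }
    ),
    Products = #table(
        type table [ProductID = text, ProductName = text, Notes = text],
        {{"P-01", "Widget", "n/a"}, {"P-02", "Gadget", "n/a"}}
    ),
    // 1. Filter to the relevant period
    Recent = Table.SelectRows(Sales, each [OrderDate] >= #date(2024, 1, 1)),
    // 2. Keep only the columns the merge and the report need
    SlimSales = Table.SelectColumns(Recent, {"ProductID", "Amount"}),
    SlimProducts = Table.SelectColumns(Products, {"ProductID", "ProductName"}),
    // 3. Aggregate the detail rows to one row per product
    Totals = Table.Group(
        SlimSales, {"ProductID"},
        {{"TotalAmount", each List.Sum([Amount]), type number}}
    ),
    // Merge the reduced tables
    Merged = Table.NestedJoin(
        Totals, {"ProductID"},
        SlimProducts, {"ProductID"},
        "Products", JoinKind.LeftOuter
    ),
    Result = Table.ExpandTableColumn(Merged, "Products", {"ProductName"})
in
    Result
```

When the source supports it, filter and column-selection steps placed early like this may also be pushed back to the data source, but that depends on the connector.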
11.2 Indexing and sort order
- Sorting is not required for joins, but can help when debugging or validating match logic.
- When merging extremely large tables from certain data sources, database-side joins may be more efficient than performing them inside Power Query after importing all data.
11.3 Incremental refresh and staging
- For Power BI models, use staging queries and incremental refresh to avoid repeatedly merging the entire historical dataset.
- For Excel-based solutions, consider staging intermediate results to files or tables that are reused across workbooks.
FAQ
Which join type in Power Query is closest to VLOOKUP in Excel?
The join type that most closely replicates a traditional VLOOKUP is the Left Outer join. You place the transaction or detail table on the left, the lookup table on the right, merge on the key, and then expand the fields you want to bring into the main table.
How do I avoid duplicate rows after a merge?
Duplicates usually indicate that the key is not unique in the lookup table. First, check the lookup table for multiple rows with the same key. If duplicates are not desired, either remove duplicates by key or group and aggregate the table before merging. Always confirm with the business context whether duplicates are logically valid.
When should I use a Full Outer join instead of two Anti joins?
A Full Outer join gives you a single unified table that contains all rows from both sources, including matched and unmatched ones. This is useful for a high-level reconciliation view. If you only care about rows that do not match, two targeted Anti joins (Left Anti and Right Anti) make it easier to isolate unmatched records on each side without the noise of matched rows.
Can I change the join type later without recreating the merge?
Yes. After creating a merge, you can open the Advanced Editor and change the JoinKind value in the Table.NestedJoin or Table.Join function. Alternatively, click the gear icon next to the merge step in Applied Steps to reopen the Merge dialog and pick a different join type, or delete the step and recreate the merge from the UI. Editing the M code is more efficient when you need to test multiple join types quickly.
How do I debug why certain rows are not matching?
A systematic approach is effective. First, perform a Full Outer join on a limited sample to see all combinations. Then add a status column to indicate whether each row is “Only in A”, “Only in B”, or “In Both”. Inspect several examples from the unmatched groups, looking for differences in data type, spaces, punctuation, or formatting in the key columns. Apply standardization steps and repeat the merge on a small sample until the results are as expected, then roll those transformations into the full dataset.
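The debugging approach described above can be sketched end to end. Tables A and B, their columns, and the expanded column name KeyB are hypothetical. The trailing space in "K3 " is deliberate, to show how a formatting difference surfaces as two unmatched rows.

```powerquery
let
    // Hypothetical samples; note the trailing space in "K3 " on side A
    A = #table(
        type table [Key = text, AmountA = number],
        {{"K1", 10}, {"K2", 20}, {"K3 ", 30}}
    ),
    B = #table(
        type table [Key = text, AmountB = number],
        {{"K1", 10}, {"K3", 30}, {"K4", 40}}
    ),
    Merged = Table.NestedJoin(A, {"Key"}, B, {"Key"}, "B", JoinKind.FullOuter),
    // Expand B's key under a new name so both keys sit side by side
    Expanded = Table.ExpandTableColumn(
        Merged, "B",
        {"Key", "AmountB"},
        {"KeyB", "AmountB"}
    ),
    // Classify each row by which side supplied a key
    WithStatus = Table.AddColumn(
        Expanded, "MatchStatus",
        each
            if [Key] <> null and [KeyB] <> null then "In Both"
            else if [Key] <> null then "Only in A"
            else "Only in B",
        type text
    ),
    // Keep only the rows that need investigation
    Unmatched = Table.SelectRows(WithStatus, each [MatchStatus] <> "In Both")
in
    Unmatched
```

Here "K3 " shows up as "Only in A" and "K3" as "Only in B", which points to a whitespace problem. Trimming the key on both sides and rerunning the query should move that pair to "In Both".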