Implementation of Python-like zip() Function in C++

2024-06-21 C++

Introduction

The range-based for loop introduced in C++11 is a convenient notation that allows concise access to all elements of a container or array.

std::vector<int> a = {1, 2, 3};

for(auto & i : a) {
    std::cout << i << std::endl;
}

Convenient as it is, the range-based for loop iterates over only one container or array at a time. Python, by contrast, has the zip() function, which lets a single for loop walk over multiple lists or other iterables in step, and this is often very handy.

a = [1, 2, 3]
b = ["aaa", "bbb", "ccc"]

for i, j in zip(a, b):
  print(f"{j}: {i}")

C++, for its part, has structured bindings, introduced in C++17. Structured bindings let you unpack the elements of a tuple into separate named variables.

std::tuple<int, std::string> a = {1, "aaa"};

auto [i, j] = a;
std::cout << j << ": " << i << std::endl;

Combining structured bindings with the range-based for loop gives something close to zip(). To that end, I created a Zipper class, a container that groups multiple containers together using tuples, and a Zip() function that returns an instance of the Zipper class.
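Structured bindings also work with a tuple of references, which is what allows a zip-style loop to modify the original elements. The following is a minimal sketch of that behavior; the function name ModifyThroughBindings is made up for illustration and is not part of the Zipper code.

```cpp
#include <string>
#include <tuple>

// Hypothetical helper for illustration: binds names to a tuple of
// references and writes through them to the original variables.
int ModifyThroughBindings()
{
    int n = 1;
    std::string s = "aaa";

    // std::tie builds a std::tuple<int &, std::string &>.
    auto [i, j] = std::tie(n, s);
    i = 42;    // writes to n
    j += "!";  // writes to s

    // Returns n only if s was also updated through the binding.
    return (s == "aaa!") ? n : -1;
}
```

Although the tuple itself is copied by `auto`, its elements are references, so `i` and `j` still refer to `n` and `s`.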

Implementation and Usage

Implementation

Please refer to the source code I uploaded on GitHub for the actual implementation.

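To use a custom container in a range-based for loop, the container must provide begin() and end(), either as member functions or as free functions found by argument-dependent lookup. Both must return iterators. Since C++17, begin() and end() are allowed to return different types.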

The iterators returned by these functions must at minimum define the following operators.

  1. != operator
  2. ++ operator
  3. * operator

That is, they should have the following structure.

template <typename T>
class MyContainer
{
public:
    class Iterator
    {
    public:
        bool operator!=(const Iterator & other) const { ... }
        Iterator & operator++() { ... }
        T & operator*() { ... }
    };

    Iterator begin() { ... }
    Iterator end() { ... }
};
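As a concrete instance of the skeleton above, here is a minimal sketch of a type that satisfies the three requirements. IntRange and CollectRange are hypothetical names chosen for this example, and the iterator yields values rather than references to keep it short.

```cpp
#include <vector>

// Hypothetical example type (not from the Zipper source): the integers
// in [first, last), usable in a range-based for loop.
class IntRange
{
public:
    class Iterator
    {
    public:
        explicit Iterator(int value) : value_(value) {}

        // 1. != : the loop keeps running while the iterator differs from end().
        bool operator!=(const Iterator & other) const { return value_ != other.value_; }
        // 2. ++ : the loop uses the prefix form to advance.
        Iterator & operator++() { ++value_; return *this; }
        // 3. *  : yields the current element.
        int operator*() const { return value_; }

    private:
        int value_;
    };

    IntRange(int first, int last) : first_(first), last_(last) {}

    Iterator begin() const { return Iterator(first_); }
    Iterator end() const { return Iterator(last_); }

private:
    int first_;
    int last_;
};

// Collects the values produced by the loop so the result can be inspected.
std::vector<int> CollectRange(int first, int last)
{
    std::vector<int> out;
    for (int v : IntRange(first, last)) {
        out.push_back(v);
    }
    return out;
}
```

Nothing else, such as iterator_category or the postfix ++, is needed for the range-based for loop itself, although standard algorithms would require more.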

In the Zipper class, all of these features are implemented.

The Zipper class takes the container types as a variadic template parameter pack, and its iterator's * operator returns a tuple holding one element from each container. For a standard container, the element type is available as T::iterator::value_type, or as T::iterator::reference for the reference type, but built-in arrays have no such member types. To support arrays as well, I use GetReference<> from the previous article to obtain the element type.
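One way to get an element reference type that covers both containers and arrays is to look at what dereferencing std::begin() yields. The sketch below is an assumption about how such a helper could look; ElementReference and First are hypothetical names, and this is not necessarily how GetReference<> is implemented.

```cpp
#include <iterator>
#include <list>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper (not the article's GetReference<>): the type obtained
// by dereferencing std::begin() of an lvalue of type T.
template <typename T>
using ElementReference = decltype(*std::begin(std::declval<T &>()));

// Works for standard containers and for built-in arrays alike.
static_assert(std::is_same_v<ElementReference<std::list<std::string>>, std::string &>);
static_assert(std::is_same_v<ElementReference<int[3]>, int &>);
static_assert(std::is_same_v<ElementReference<const double[2]>, const double &>);

// Returns a reference to the first element of a container or array.
template <typename T>
ElementReference<T> First(T & c)
{
    return *std::begin(c);
}
```

Because std::begin() has an overload for arrays, no member typedef is needed on T.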

The != operator for iterators also needed some care. The Zipper::Iterator class holds a tuple of iterators, one per container, as its member variable, and this tuple tracks the current position. Tuples have comparison operators that could compare them directly, but tuple equality requires every element to match. When the containers have different sizes, a comparison with Zipper::end() would therefore not succeed at the point where the shortest container runs out, and the loop would step past its end. To handle this, I defined Zipper::EndIterator as a separate class for the return value of Zipper::end(), and Zipper::Iterator is compared with Zipper::EndIterator using dedicated logic instead of tuple comparison. Since the loop must stop as soon as any one iterator reaches its end, the == operator checks whether any position matches Zipper::end(), and the != operator is its negation.
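The problem with comparing the tuples directly can be reproduced in a few lines. This is a small sketch with a hypothetical function name, using two vectors of different lengths in place of the Zipper internals.

```cpp
#include <tuple>
#include <vector>

// Hypothetical demonstration: returns true if plain tuple comparison would
// let a zip loop continue although the shorter container is exhausted.
bool TupleCompareOverruns()
{
    std::vector<int> longer = {1, 2, 3};
    std::vector<int> shorter = {1};

    auto current = std::make_tuple(longer.begin(), shorter.begin());
    const auto end = std::make_tuple(longer.end(), shorter.end());

    // Advance both iterators once; the second now equals shorter.end().
    ++std::get<0>(current);
    ++std::get<1>(current);

    const bool shorter_done = (std::get<1>(current) == std::get<1>(end));
    // Tuple != is true when at least one element differs.
    const bool tuples_differ = (current != end);

    return shorter_done && tuples_differ;
}
```

A loop condition based on `current != end` would still be true here, so the next step would increment and dereference an iterator that is already at the end of the shorter vector.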

The comparison with Zipper::end() can be written with parameter pack expansion. The following member function expands the pack into a bool array that holds one comparison result per container.

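// N... indexes the iterators stored in the tuple, e.g. 0, 1, 2 for three containers.
template <std::size_t... N>
bool IsEnd(const EndIterator & it, std::index_sequence<N...>) const
{
    // One comparison result per container: true where that iterator is at its end.
    bool chk[] = {(std::get<N>(iter_) == std::get<N>(it.iter_))...};

    // The zipped range ends as soon as any single container is exhausted.
    return std::any_of(std::begin(chk), std::end(chk), [](bool x) { return x; });
}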

The std::any_of() algorithm then reports whether at least one element of this array is true.
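To show how these pieces fit together, here is a simplified sketch of a zipper with a separate end type. It is not the Zipper class from the GitHub source: SimpleZipper, SimpleZip, and JoinPairs are hypothetical names, only lvalue containers are assumed, and the end check uses a C++17 fold expression in place of the bool array and std::any_of().

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Simplified, hypothetical zipper (not the article's Zipper class).
// Assumes lvalue containers that outlive the loop.
template <typename... Cs>
class SimpleZipper
{
    using Iters = std::tuple<decltype(std::begin(std::declval<Cs &>()))...>;
    using Refs = std::tuple<decltype(*std::begin(std::declval<Cs &>()))...>;
    using Indices = std::index_sequence_for<Cs...>;

public:
    // Sentinel type returned by end(); it only stores the end iterators.
    struct EndIterator
    {
        Iters iter_;
    };

    class Iterator
    {
    public:
        explicit Iterator(Iters iter) : iter_(iter) {}

        // Keep looping only while NO underlying iterator has reached its end.
        bool operator!=(const EndIterator & it) const { return !IsEnd(it, Indices{}); }

        // Advance every underlying iterator by one.
        Iterator & operator++()
        {
            std::apply([](auto &... i) { (++i, ...); }, iter_);
            return *this;
        }

        // A tuple of references to the current elements.
        Refs operator*() const
        {
            return std::apply([](auto &... i) { return Refs(*i...); }, iter_);
        }

    private:
        template <std::size_t... N>
        bool IsEnd(const EndIterator & it, std::index_sequence<N...>) const
        {
            // Fold expression: true if at least one position matches its end.
            return ((std::get<N>(iter_) == std::get<N>(it.iter_)) || ...);
        }

        Iters iter_;
    };

    explicit SimpleZipper(Cs &... cs)
        : begin_(std::begin(cs)...), end_{Iters(std::end(cs)...)} {}

    Iterator begin() const { return Iterator(begin_); }
    EndIterator end() const { return end_; }

private:
    Iters begin_;
    EndIterator end_;
};

template <typename... Cs>
SimpleZipper<Cs...> SimpleZip(Cs &... cs)
{
    return SimpleZipper<Cs...>(cs...);
}

// Zips containers of different lengths and joins the pairs into one string.
std::string JoinPairs()
{
    std::vector<int> a = {1, 2, 3};
    std::list<std::string> b = {"x", "y"};  // shorter: the loop stops after 2 steps
    std::string out;
    for (auto [i, s] : SimpleZip(a, b)) {
        out += s + std::to_string(i) + ";";
    }
    return out;
}
```

Returning a different type from end() relies on the C++17 rule that begin() and end() need not have the same type, which fits here because structured bindings already require C++17.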

Usage

Here’s an example of usage assuming zipper.h is in the same directory.

#include <iostream>
#include <list>
#include <map>
#include <string>
#include <vector>

#include "zipper.h"

int main()
{
    std::vector<int> a = {1, 2, 3};
    std::list<std::string> b = {"a", "b", "c"};
    std::map<int, std::string> c = {{0, "abc"}, {1, "def"}, {2, "ghi"}};

    for (auto [s, t, u] : Zip(a, b, c)) {
        std::cout << s << " " << t << " " << u.first << " " << u.second << std::endl;
    }

    return 0;
}

Conclusion

I tried creating a Python-like Zip() function using tuples. This allows handling multiple containers together in a single range-based for loop.

Note that C++23 standardizes std::views::zip in <ranges>. If your environment supports C++23, it is advisable to use that instead.

std::vector<int> a = {1, 2, 3};
std::list<std::string> b = {"aaa", "bbb", "ccc"};

// zip yields a temporary tuple of references, so bind with auto && (or auto), not auto &.
for (auto && [i, j] : std::views::zip(a, b)) {
    std::println("{}: {}", j, i);
}

Posted by izadori