Dolma Party

[Türkçe] [code]
[Substack]

I recently attended the 23rd edition of an annual dolma-themed dinner party. As you can imagine, this has become a tradition for some people – I was just two years old when the first one took place. Everyone was required to bring a homemade dolma. A “homemade” dolma bought from a supermarket didn’t count: “your dolma must be homemade by your hands,” per an official email from the organizers. I spent a nontrivial amount of time preparing my contribution to the party, so hopefully cooking dolma builds character or something.

I helped out with the registration process during the party. Newcomers would receive a label for their dolma before placing it at the dining area. Importantly, this label contained dietary information in the form of four parameters: “vegan,” “vegetarian,” “no red meat,” and “no dairy.” This blog post is about some observations and thinking on this dietary information.

First, we observe that not all $2^4 = 16$ combinations are possible here. “Vegan” implies the rest, so there cannot be a dolma that contains red meat, but is vegan at the same time. If someone manages to make such a dolma, that would be a hell of an achievement. So, which combinations are valid? First, we will lay down all possible combinations. Then, for each combination, we will either argue that it is impossible or list some ingredients that a dolma can/must have in order to satisfy that combination. To aid with this, I drew the following Venn diagram:

Let’s go through the various colors and letters on this diagram:

Now that we know that certain regions are impossible, let’s simplify the Venn diagram:

The numbers denote how many dolmas there were in each category.

Note that two categories (marked with “$0$” and “$\underline{0}$”) were not seen at the dolma party. To show why they are still possible, we need to give examples of dolmas that would be in these categories:

To conclude this part, we found that among $2^4 = 16$ combinations, only $7$ of them are possible. Then, we counted how many dolmas are in each category. For categories that did not appear in the 23rd dolma party, we gave examples of ingredient combinations that can be used to make dolmas in those categories. This last bit should not be seen as a suggestion for future dolma parties.

The next step is to address a situation that piqued my interest while drawing the Venn diagrams above.

New categories for a more homogeneous distribution

We observe that the distribution of dolmas among the categories is skewed: a lot of dolmas accumulated in certain categories, while some categories had no dolmas. This brings us to the following question:

“Are there other categories that would have distributed the dolmas more equally among each other?”

We need to explain/clarify a few things first. During the party, there was technically a two-step process: first, four parameters were defined, and second, each possible combination of these four parameters was called a “category.” In this analysis, we will define the categories directly.

Let’s model foods as binary vectors over ingredient presence:

Beef Chicken meat Tomato Olive oil Milk Butter Honey
Food 1 1 0 1 0 1 0 1
Food 2 0 1 1 0 0 1 1
Food 3 1 1 0 1 0 0 1
Food 4 0 0 1 1 1 0 0
Food 5 1 0 0 1 0 1 0

In this setting, a category is a Boolean expression that describes which ingredients are allowed/not allowed. Importantly, we won’t allow categories to intersect, just like how the regions in the Venn diagrams were not intersecting. It is because of this rule that we called “vegan,” “vegetarian,” “no red meat” and “no dairy” not as categories, but as parameters.

For the table above, we have the following definitions for the four parameters at the party:

Of course, if we had other ingredients in our input data, we would need different definitions. For example, in another input where eggs are an ingredient in a food, we would have to add NOT "egg" to the definition of “vegan.”

Note that the four parameters that we used to classify the dolmas made sense – these are actual dietary restrictions some people follow. While looking for categories that split the dolmas more equally among each other, we will not care if these categories are meaningful from a culinary perspective. We are done with macadamia, now is the time for academia.

We will use linear programming. Let’s first clarify what our algorithm will receive as input. We are given how many categories we are supposed to split the dolmas into. We also know the ingredients of each dolma. As an implication, we also know which ingredients were used for the entire party. In more technical terms, here is our input data:

The goal:

The variables:

Objective function: $$ \begin{aligned} \min(S_{\text{max}} - S_{\text{min}}) \end{aligned} $$

Constraints:

Running the algorithm

Here is an implementation of the constraint system above using PuLP in Python. The code comes with a uniformly-sampled matrix representing 30 foods and 30 ingredients, the dimensions of which were meant to roughly represent a dolma party.

For the readers to get a sense of the output of the algorithm, below is what you get when you tell the algorithm to split the foods in the 30 foods by 30 ingredients matrix (the one provided in the code) into 5 categories:

Result - Optimal solution found

Objective value:                0.00000000
Enumerated nodes:               2121
Total iterations:               113678
Time (CPU seconds):             21.71
Time (Wallclock seconds):       21.77

Option for printingOptions changed from normal to all
Total time (CPU seconds):       21.73   (Wallclock seconds):       21.80

Objective (S_max - S_min): 0.0

Category 0:
  Food 0
  Food 3
  Food 4
  Food 13
  Food 16
  Food 18
  Rule:
    NOT Ingredient 0
    NOT Ingredient 1
    HAS Ingredient 3
    HAS Ingredient 27

Category 1:
  Food 1
  Food 6
  Food 7
  Food 11
  Food 21
  Food 24
  Rule:
    HAS Ingredient 8
    HAS Ingredient 20
    HAS Ingredient 21
    NOT Ingredient 23

Category 2:
  Food 2
  Food 17
  Food 19
  Food 20
  Food 26
  Food 27
  Rule:
    HAS Ingredient 0
    HAS Ingredient 7
    HAS Ingredient 18
    HAS Ingredient 23

Category 3:
  Food 5
  Food 9
  Food 12
  Food 22
  Food 28
  Food 29
  Rule:
    HAS Ingredient 1
    HAS Ingredient 3
    NOT Ingredient 7

Category 4:
  Food 8
  Food 10
  Food 14
  Food 15
  Food 23
  Food 25
  Rule:
    NOT Ingredient 3
    NOT Ingredient 18

Conclusion

For the 30 by 30 matrix, the runtime was around 22 seconds on my laptop. While this runtime is acceptable, increasing the parameters even slightly leads to much longer runtimes, which is not that surprising for linear programming. A next step could be to explore some heuristics – the current algorithm looks for the optimal solution.

Acknowledgements