arrays: add subtract/2 #25868

gechandesu · 2025-11-30T18:25:25Z

Added a generic function, subtract, that removes from one array every element found in another array, along with a test.

assert arrays.subtract([1, 2, 3, 4, 5, 6, 7], [3, 5, 6]) == [1, 2, 4, 7]

This is a convenient way to solve a fairly common task: finding the difference between two sets of elements. The arrays.diff module is not suitable for this.

tankf33der · 2025-11-30T19:48:48Z

@gechandesu shouldn't it be subtract? Check the spelling.

gechandesu · 2025-11-30T20:00:44Z

You're right, my English is bad...

tankf33der · 2025-11-30T20:08:40Z

My tests show this is an equivalent function. What do you think?

fn mike[T](a []T, b []T) []T {
	mut result := []T{cap: a.len}
	for elem in a {
		if elem !in b {
			result << elem
		}
	}
	return result
}

gechandesu · 2025-11-30T20:18:34Z

This is much slower on large arrays. I tested it: elem !in b is effectively a nested loop over the b array.

In the worst case (O(n*m) vs O(n+m)):

/tmp $ v run arrays.v
 SPENT    10.685 ms in subtract
 SPENT  8108.553 ms in mike
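To illustrate why the array version is quadratic, here is a minimal sketch of the linear scan that a membership check on an array amounts to. The helper name contains_linear is hypothetical and this is not the code the compiler actually generates; it only shows that each check costs up to b.len comparisons, so running it once per element of a gives roughly n*m comparisons in total.

```v
// contains_linear is a hypothetical helper, written only to illustrate
// the cost of a membership test on a plain array.
fn contains_linear[T](arr []T, x T) bool {
	// Every lookup may have to visit all elements of arr.
	for item in arr {
		if item == x {
			return true
		}
	}
	return false
}

fn main() {
	b := [3, 5, 6]
	println(contains_linear(b, 5)) // true
	println(contains_linear(b, 4)) // false
}
```

With both arrays holding 100_000 elements and no matches, that is on the order of 10^10 comparisons, against about 2 * 10^5 map operations for the map-based version.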

Benchmark

import benchmark

fn mike[T](a []T, b []T) []T {
	mut result := []T{cap: a.len}
	for elem in a {
		if elem !in b {
			result << elem
		}
	}
	return result
}

fn subtract[T](a []T, b []T) []T {
	mut result := []T{cap: a.len}
	mut b_set := map[T]bool{}
	for elem in b {
		b_set[elem] = false
	}
	for elem in a {
		if elem !in b_set {
			result << elem
		}
	}
	return result
}

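fn main() {
	iters := 1
	len := 100_000
	// Worst case for mike: a is all ones and b is all zeros, so no element
	// of a is ever found in b and every `!in` check scans the whole of b.
	a := []int{len: len, init: 1}
	b := []int{len: len}
	mut bench := benchmark.start()
	for _ in 0 .. iters {
		_ := subtract(a, b)
	}
	bench.measure('subtract')
	for _ in 0 .. iters {
		_ := mike(a, b)
	}
	bench.measure('mike')
}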

I'd like to avoid the map, since it limits the input types to those usable as map keys, but the price of avoiding it is very poor performance. I could add a comptime type check and fall back to the slower algorithm for any type that isn't supported as a map key, but for now I've decided to leave it as is.

tankf33der · 2025-12-01T04:43:57Z

For science, I'd like to show a case where mike() is faster, too.

import benchmark
import rand

fn mike[T](a []T, b []T) []T {
	mut result := []T{cap: a.len}
	for elem in a {
		if elem !in b {
			result << elem
		}
	}
	return result
}

fn subtract[T](a []T, b []T) []T {
	mut result := []T{cap: a.len}
	mut b_set := map[T]bool{}
	for elem in b {
		b_set[elem] = false
	}
	for elem in a {
		if elem !in b_set {
			result << elem
		}
	}
	return result
}

fn main() {
	iters := 1_000_000
	mut bench := benchmark.start()
	for _ in 0 .. iters {
		a := rand.bytes(rand.u8())!
		b := rand.bytes(rand.u8())!
		_ := subtract(a, b)
	}
	bench.measure('subtract')
	for _ in 0 .. iters {
		a := rand.bytes(rand.u8())!
		b := rand.bytes(rand.u8())!
		_ := mike(a, b)
	}
	bench.measure('mike')
}

gechandesu · 2025-12-01T07:43:44Z

Yes, the map allocations can make subtract slower on small arrays.

/tmp $ v run arrays.v
 SPENT 20909.702 ms in subtract
 SPENT 12172.886 ms in mike
/tmp $ v run arrays.v
 SPENT 20919.367 ms in subtract
 SPENT 12194.749 ms in mike
/tmp $ v run arrays.v
 SPENT 20893.922 ms in subtract
 SPENT 12174.510 ms in mike

I'll try to find the input size at which the implementation without a map becomes slower than the one with a map, and then write a combined version that picks the algorithm based on the lengths of the arrays. Is that okay?
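A combined version along those lines could look roughly like the sketch below. The function name subtract_adaptive and the constant small_threshold are hypothetical, and the value 64 is only a placeholder: the real crossover point was never measured in this thread and would have to come from benchmarks.

```v
// Hypothetical cutoff; the actual crossover point must be benchmarked.
const small_threshold = 64

// subtract_adaptive is an illustrative sketch, not the code from this PR.
fn subtract_adaptive[T](a []T, b []T) []T {
	mut result := []T{cap: a.len}
	if b.len <= small_threshold {
		// Small b: a linear scan avoids the cost of building a map.
		for elem in a {
			if elem !in b {
				result << elem
			}
		}
		return result
	}
	// Large b: build a lookup map once, then check each element of a.
	mut b_set := map[T]bool{}
	for elem in b {
		b_set[elem] = true
	}
	for elem in a {
		if elem !in b_set {
			result << elem
		}
	}
	return result
}

fn main() {
	println(subtract_adaptive([1, 2, 3, 4, 5, 6, 7], [3, 5, 6])) // [1, 2, 4, 7]
}
```

The map branch still restricts T to types that can be used as map keys, so this sketch does not remove the type limitation mentioned earlier.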

jorgeluismireles · 2025-12-02T16:12:26Z

Or the word difference could be used, as it is more appropriate for sets: https://en.wikipedia.org/wiki/Set_(mathematics)#Set_difference

gechandesu · 2025-12-02T16:42:54Z

True. I thought about it some more. I'm closing this PR, since the same thing is already implemented in datatypes.Set; I missed that. It's better than a new ugly function with inconsistent behavior.
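For reference, a minimal sketch of getting the same result with datatypes.Set, using only its add_all and exists methods. The wrapper name subtract_with_set is hypothetical, and it is fixed to int for simplicity. The exact Set API should be checked against the installed V version.

```v
import datatypes

// subtract_with_set is a hypothetical wrapper, fixed to int for simplicity.
fn subtract_with_set(a []int, b []int) []int {
	// Collect the elements to exclude into a set.
	mut b_set := datatypes.Set[int]{}
	b_set.add_all(b)
	mut result := []int{cap: a.len}
	// Keep the elements of a that are not in the set, preserving their order.
	for elem in a {
		if !b_set.exists(elem) {
			result << elem
		}
	}
	return result
}

fn main() {
	println(subtract_with_set([1, 2, 3, 4, 5, 6, 7], [3, 5, 6])) // [1, 2, 4, 7]
}
```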

arrays: add substract/2

13b4316

gechandesu changed the title ~~arrays: add substract/2~~ arrays: add subtract/2 Nov 30, 2025

fix spelling

00a7453

gechandesu marked this pull request as draft December 1, 2025 15:33

gechandesu closed this Dec 2, 2025

gechandesu deleted the arrays_substract branch December 2, 2025 19:54
