Primes
A challenge is always good. Ward Pond has a way of finding prime numbers.
My machine isn't as chunky as some, so my numbers are different. But...
I think Ward's performance largely comes from the fact that his starting point assumes he has found all the primes up to 31 already. So I adjusted his query a little, commenting out the section where he had all the previously done primes:
INSERT INTO dbo.Primes (Prime)
SELECT num
FROM dbo.nums
WHERE /*(num % 2 <> 0 OR num = 2)
AND (num % 3 <> 0 OR num = 3)
AND (num % 5 <> 0 OR num = 5)
AND (num % 7 <> 0 OR num = 7)
AND (num % 11 <> 0 OR num = 11)
AND (num % 13 <> 0 OR num = 13)
AND (num % 17 <> 0 OR num = 17)
AND (num % 19 <> 0 OR num = 19)
AND (num % 23 <> 0 OR num = 23)
AND (num % 29 <> 0 OR num = 29)
AND (num % 31 <> 0 OR num = 31)
AND */ num <> 1
AND num <= @Limit
..and having a different starting place for @Last
SET @Last = 1 --31
I set @Limit to 10,000, and it ran in 350 ms. His original query did it in only 60ms, which tells me that starting at 31 makes a massive difference, and is definitely a good idea. But let's see how another algorithm might compare, with both systems starting at 2.
I'm starting with a prepopulated auxiliary table of numbers, just as he did. I'm using the same counting mechanism as him.
DECLARE @Limit int;
SET @Limit = 10000;
DECLARE @Start datetime, @End datetime;
SET @Start = CURRENT_TIMESTAMP;
insert into dbo.rf_primesfound
select p.num
from dbo.nums n1
join
dbo.nums f
on n1.num between 2 and @limit
and f.num between 2 and sqrt(@limit)
and f.num <= @limit / n1.num
right join
dbo.nums p
on p.num = f.num * n1.num
where f.num is null
and p.num between 2 and @limit
SET @End = CURRENT_TIMESTAMP
SELECT @Start AS Start_time, @End AS End_time,
DATEDIFF(ms, @Start, @End) AS Duration,
COUNT(*) AS Primes_found, @Limit AS Limit
FROM dbo.rf_primesfound
select * from dbo.rf_primesfound
I set @Limit to 10,000, and my one ran in 170 ms.
Ok, so let's see how well they scale. Let's try for 100,000. Mine took 1673ms. His took 3243ms.
For 1,000,000, mine took 21 seconds. His took 55 seconds. But when I let him start at 31, his took about 5 seconds.
I think the difference comes down to the fact that Ward is going through the list one prime at a time, whereas I'm taking a set-based approach and just filtering them all out at once.
I feel like there would be ways of improving it even more, but will be someone else's challenge.
I definitely think the way to go is to start with the low-hanging fruit. Perhaps there are ways that I could improve mine to take a few steps to get the low-hanging fruit first... not today though.