Rob Farley

03 January 2007

Reminder of change-of-address

Quick reminder to anyone still using this blog that new posts are no longer appearing here. You'll need to check http://msmvps.com/blogs/robfarley for new posts.

08 October 2006

I'm looking to move my blog across to msmvps.com - if I can merge the two somehow, then I will do so, but for the time being please start also checking msmvps.com/blogs/robfarley . The RSS feed will remain at http://feeds.feedburner.com/robfarley/ .

05 October 2006

Fellow MVP (name dropping)

So far, one of the best things about being an MVP is access to the MVP newsgroups, where some of the heroes of SQL hang out. I'm not sure I'll fit in, but it is very flattering to have someone like Adam Machanic blog that I showed him something (of course, I'm sure this is a once-off).

04 October 2006

To be colleagued

I think there needs to be a new word in the dictionary, to describe those times when a colleague comes over to your desk and takes time from you. Actually, it might not be a colleague, and they might just take time from you in some other way, like via the telephone or an Instant Messaging system.

Obviously when someone colleagues you, it's urgent to them, and you have to treat it urgently too. It stops what you're doing for a few moments. It's not a bad thing - sometimes it can be just what you need to get a fresh perspective on the work you're doing, but it's definitely something which I think needs a verb. "Interrupted" doesn't describe well, because it has negative connotations. And besides, you could get colleagued for a long period if someone drags you into a meeting.

02 October 2006

MVP Photo

This is a decent list. Darren, Grant, Greg & Greg are all really great guys. I feel honoured to be a part of this crowd, and can't help but think that there are other names that should be there too.

I need to get a decent photo of myself sorted. Perhaps there will be someone at Wagga who is good at taking photos.

Rob Farley - MVP

More letters... These ones are certainly very nice. :)

I'll write more when I have time. I'm just leaving for the public-holiday church picnic.

30 September 2006

More on primes

I've changed my algorithm a bit, so that it starts with the low-hanging fruit.

insert into dbo.rf_primesfound (num)
select 2 union all
select 3 union all
select 5 union all
select 7 union all
select 11 union all
select 13 union all
select 17 union all
select 19 union all
select 23 union all
select 29 union all
select 31;

DECLARE @BigLimit int;
SET @BigLimit = 1000000;

DECLARE @Limit int;
SET @Limit = 32;
DECLARE @OldLimit int;
SET @OldLimit = 32;

DECLARE @Start datetime, @End datetime;
SET @Start = CURRENT_TIMESTAMP;

while @limit < @BigLimit
begin
select @Oldlimit = @limit, @limit = @limit * @limit;
if @limit > @BigLimit set @limit = @BigLimit;

insert into dbo.rf_primesfound
select p.num
from dbo.nums n1
join
dbo.rf_primesfound f
on n1.num between 2 and @limit / f.num
right join
dbo.nums p
on p.num = f.num * n1.num
where f.num is null
and p.num > @Oldlimit
and p.num <= @limit;

end

SET @End = CURRENT_TIMESTAMP

SELECT @Start AS Start_time, @End AS End_time,
DATEDIFF(ms, @Start, @End) AS Duration,
COUNT(*) AS Primes_found, @Limit AS Limit
FROM dbo.rf_primesfound

select * from dbo.rf_primesfound

This gets down to 7 seconds, which isn't quite as good as Ward's 5. I do wonder what other improvements I could make, but I have a few other things to sort out this weekend (like my talk for the SQL Code Camp next weekend).

Primes

A challenge is always good. Ward Pond has a way of finding prime numbers.

My machine isn't as chunky as some, so my numbers are different. But...

I think Ward's performance largely comes from the fact that his starting point assumes he has found all the primes up to 31 already. So I adjusted his query a little, commenting out the section where he had all the previously done primes:

INSERT INTO dbo.Primes (Prime)
SELECT num
FROM dbo.nums
WHERE /*(num % 2 <> 0 OR num = 2)
AND (num % 3 <> 0 OR num = 3)
AND (num % 5 <> 0 OR num = 5)
AND (num % 7 <> 0 OR num = 7)
AND (num % 11 <> 0 OR num = 11)
AND (num % 13 <> 0 OR num = 13)
AND (num % 17 <> 0 OR num = 17)
AND (num % 19 <> 0 OR num = 19)
AND (num % 23 <> 0 OR num = 23)
AND (num % 29 <> 0 OR num = 29)
AND (num % 31 <> 0 OR num = 31)
AND */ num <> 1
AND num <= @Limit

..and having a different starting place for @Last

SET @Last = 1 --31

I set @Limit to 10,000, and it ran in 350 ms. His original query did it in only 60ms, which tells me that starting at 31 makes a massive difference, and is definitely a good idea. But let's see how another algorithm might compare, with both systems starting at 2.

I'm starting with a prepopulated auxiliary table of numbers, just as he did. I'm using the same counting mechanism as him.

DECLARE @Limit int;
SET @Limit = 10000;

DECLARE @Start datetime, @End datetime;
SET @Start = CURRENT_TIMESTAMP;

insert into dbo.rf_primesfound
select p.num
from dbo.nums n1
join
dbo.nums f
on n1.num between 2 and @limit
and f.num between 2 and sqrt(@limit)
and f.num <= @limit / n1.num
right join
dbo.nums p
on p.num = f.num * n1.num
where f.num is null
and p.num between 2 and @limit

SET @End = CURRENT_TIMESTAMP

SELECT @Start AS Start_time, @End AS End_time,
DATEDIFF(ms, @Start, @End) AS Duration,
COUNT(*) AS Primes_found, @Limit AS Limit
FROM dbo.rf_primesfound

select * from dbo.rf_primesfound

I set @Limit to 10,000, and my one ran in 170 ms.

Ok, so let's see how well they scale. Let's try for 100,000. Mine took 1673ms. His took 3243ms.

For 1,000,000, mine took 21 seconds. His took 55 seconds. But when I let him start at 31, his took about 5 seconds.

I think the difference comes down to the fact that Ward is going through the list one prime at a time, whereas I'm taking a set-based approach and just filtering them all out at once.
I feel like there would be ways of improving it even more, but will be someone else's challenge.

I definitely think the way to go is to start with the low-hanging fruit. Perhaps there are ways that I could improve mine to take a few steps to get the low-hanging fruit first... not today though.

29 September 2006

Ownership chaining

I've put an article together about ownership chaining at http://sqlblogcasts.com/blogs/rob_farley/articles/1140.aspx

Auxiliary table of numbers

There are all kinds of ways that an auxiliary table of numbers can be useful. Just today I used one to generate a bunch of test data.

If you want your own table of nums, try this:

create table dbo.nums (num int not null primary key);
insert into dbo.nums (num) select 1;
insert into dbo.nums (num) select 2;
go

create procedure dbo.topup(@max int) as
begin
if ((select count(*) from nums) < @max)
begin
  while ((select count(*) from nums) < @max/2)
  begin
   insert into dbo.nums (num) select num + count(*) over() from nums;
  end;

  insert into dbo.nums (num)
  select n
  from (select num + count(*) over() as n from nums) t
  where n <= @max;
end;
end;

This will start your table off very small, but you can call the topup procedure every time you need it. For example, if you know you're going to need to have to it go as high as the number of rows in the blah table, do:

declare @newmax int
select @newmax = count(*) from blah
exec dbo.topup @newmax

--before your query that uses nums.

Rob Farley - You could win from here!

Links

Blogs