|
Did you mean to come here? My blog is now at http://msmvps.com/blogs/robfarley
I've changed my algorithm a bit, so that it starts with the low-hanging fruit.
insert into dbo.rf_primesfound (num) select 2 union all select 3 union all select 5 union all select 7 union all select 11 union all select 13 union all select 17 union all select 19 union all select 23 union all select 29 union all select 31;
DECLARE @BigLimit int; SET @BigLimit = 1000000;
DECLARE @Limit int; SET @Limit = 32; DECLARE @OldLimit int; SET @OldLimit = 32;
DECLARE @Start datetime, @End datetime; SET @Start = CURRENT_TIMESTAMP;
while @limit < @BigLimit begin select @Oldlimit = @limit, @limit = @limit * @limit; if @limit > @BigLimit set @limit = @BigLimit;
insert into dbo.rf_primesfound select p.num from dbo.nums n1 join dbo.rf_primesfound f on n1.num between 2 and @limit / f.num right join dbo.nums p on p.num = f.num * n1.num where f.num is null and p.num > @Oldlimit and p.num <= @limit;
end
SET @End = CURRENT_TIMESTAMP
SELECT @Start AS Start_time, @End AS End_time, DATEDIFF(ms, @Start, @End) AS Duration, COUNT(*) AS Primes_found, @Limit AS Limit FROM dbo.rf_primesfound
select * from dbo.rf_primesfound
This gets down to 7 seconds, which isn't quite as good as Ward's 5. I do wonder what other improvements I could make, but I have a few other things to sort out this weekend (like my talk for the SQL Code Camp next weekend).
A challenge is always good. Ward Pond has a way of finding prime numbers.
My machine isn't as chunky as some, so my numbers are different. But...
I think Ward's performance largely comes from the fact that his starting point assumes he has found all the primes up to 31 already. So I adjusted his query a little, commenting out the section where he had all the previously done primes:
INSERT INTO dbo.Primes (Prime) SELECT num FROM dbo.nums WHERE /*(num % 2 <> 0 OR num = 2) AND (num % 3 <> 0 OR num = 3) AND (num % 5 <> 0 OR num = 5) AND (num % 7 <> 0 OR num = 7) AND (num % 11 <> 0 OR num = 11) AND (num % 13 <> 0 OR num = 13) AND (num % 17 <> 0 OR num = 17) AND (num % 19 <> 0 OR num = 19) AND (num % 23 <> 0 OR num = 23) AND (num % 29 <> 0 OR num = 29) AND (num % 31 <> 0 OR num = 31) AND */ num <> 1 AND num <= @Limit
..and having a different starting place for @Last
SET @Last = 1 --31
I set @Limit to 10,000, and it ran in 350 ms. His original query did it in only 60ms, which tells me that starting at 31 makes a massive difference, and is definitely a good idea. But let's see how another algorithm might compare, with both systems starting at 2.
I'm starting with a prepopulated auxiliary table of numbers, just as he did. I'm using the same counting mechanism as him.
DECLARE @Limit int; SET @Limit = 10000;
DECLARE @Start datetime, @End datetime; SET @Start = CURRENT_TIMESTAMP;
insert into dbo.rf_primesfound select p.num from dbo.nums n1 join dbo.nums f on n1.num between 2 and @limit and f.num between 2 and sqrt(@limit) and f.num <= @limit / n1.num right join dbo.nums p on p.num = f.num * n1.num where f.num is null and p.num between 2 and @limit
SET @End = CURRENT_TIMESTAMP
SELECT @Start AS Start_time, @End AS End_time, DATEDIFF(ms, @Start, @End) AS Duration, COUNT(*) AS Primes_found, @Limit AS Limit FROM dbo.rf_primesfound
select * from dbo.rf_primesfound
I set @Limit to 10,000, and my one ran in 170 ms.
Ok, so let's see how well they scale. Let's try for 100,000. Mine took 1673ms. His took 3243ms.
For 1,000,000, mine took 21 seconds. His took 55 seconds. But when I let him start at 31, his took about 5 seconds.
I think the difference comes down to the fact that Ward is going through the list one prime at a time, whereas I'm taking a set-based approach and just filtering them all out at once. I feel like there would be ways of improving it even more, but will be someone else's challenge.
I definitely think the way to go is to start with the low-hanging fruit. Perhaps there are ways that I could improve mine to take a few steps to get the low-hanging fruit first... not today though.
There are all kinds of ways that an auxiliary table of numbers can be useful. Just today I used one to generate a bunch of test data.
If you want your own table of nums, try this:
create table dbo.nums (num int not null primary key); insert into dbo.nums (num) select 1; insert into dbo.nums (num) select 2; go
create procedure dbo.topup(@max int) as begin if ((select count(*) from nums) < @max) begin while ((select count(*) from nums) < @max/2) begin insert into dbo.nums (num) select num + count(*) over() from nums; end;
insert into dbo.nums (num) select n from (select num + count(*) over() as n from nums) t where n <= @max; end; end;
This will start your table off very small, but you can call the topup procedure every time you need it. For example, if you know you're going to need to have to it go as high as the number of rows in the blah table, do:
declare @newmax int select @newmax = count(*) from blah exec dbo.topup @newmax
--before your query that uses nums.
From the South Australia Budget (Government, not User Groups)...
http://www.treasury.sa.gov.au/custom/files/200609/2006-2007%20Budget%20Statement.pdf Chapter 2, page 2.22. (page 50 of 236)
"A significant savings measure arising from the Review of Priorities is the proposed abolition of the Department for Administrative and Information Services....."
I wonder what kind of effect this will have on SA.
I listened to Nick Randolph's podcast this morning about user-groups. The first of his MSDev series at ThePodcastNetwork. Nick's a great guy, and it's great that he's picked up the mantle of doing this show.
During the show, it was said that most people at TechEd already know all about the Vista Roadmap, etc. That they've read that on people's blogs, etc.
But I think that the larger proportion of people are there because they've been SENT. But people GO to user-groups. The TechEd crowd may have put their hand up to go, but essentially, they're there because their employers are paying for it. They are not the people who are keeping up-to-date with technology. They go to learn. They go as a training exercise. A smaller proportion of TechEd attendees, of which anyone involved in user-groups would fall into, are there because they want to network, because they want to meet the experts. They go to meet. They go as a networking exercise.
Developers have a very different perspective on user groups to IT Pros and Managers. Managers are often wary of user groups because they fear their staff might get to know other employers and be taken away from them. But IT Pros are far more interested in best-practice than the creativity that goes with Developers. IT Pros are great people, but they often fall into the brick-layer analogy. Brick-layers are interested in the best way to lay bricks, and are certainly interested in the new ways of doing that. But they're not into the social scene like the architects, who are always looking for inspiration from their peers to improve their creativity.
And I think those are two of the keys words.
Inspiration and Creativity. Compared with Determination and Reliability.
Developers need the inspiration to be creative to do their job really well. IT Pros need the determination to be reliable. Of course, Developers need reliablity as well, but assuming that they meet the corporate standards of their workplace, the thing that determines who are the kings of the developer-world is the creativity. Similar attributes may include ingenuity and skill. These are all things that people get from sharpening against each other. But IT Pros don't need this as much. They need to have the set of rules to be able to implement their systems consistently, having the determination to resist creativity, and choose reliability instead.
Creativity and Reliability conflict with each other in many ways. It's the difference between the romantic and the faithful. The two can work together very well, but in the same way that young lovers will be disappointed with years of fidelity, missing the excitement of romance, the creative can be disappointed with the idea of doing something the tried and tested, reliable way. And people who are more concerned with reliability may shy away from choosing a creative solution, for fear that it may not have the same robustness as the way they used to do things.
SQL Server is an interesting tool. It is a back-end product. It is the realm of data storage. It is not a user-facing product. And yet, it is a product which involves a large amount of creativity. A database administrator (DBA) must choose the reliable option. A database developer (DBD) must also choose the reliable option. But a DBD is much closer in mindset to the Developer than the DBA. The DBA is much closer in mindset to the IT Pro than the Developer. SQL people have the two characters sitting on our shoulders. The cartoon devil and angel. The IT Pro and the Developer. The brick-layer and the artist. I'm not saying which is the devil and which is the angel, I'll leave that up to you.
The problem is that DBDs are not just developers. Typically, DBDs are part-DBA, part-Developer. This then presents an internal struggle. We want to be able to have the reliable, robust solution, and know the best ways of achieving this. And yet we want to find the creative ways to manipulate the data. If you can in some way separate the DBA aspects from the DBD role (of course you can't), then you find the DBD is trying to bend the rules, trying to work to find the exquisite query, a thing of beauty which can get the perfect resultset with performance never seen. It becomes the role of aeronautical engineer, working in the wind tunnel to find a beautiful design for a race car to get greater speed. The creativity is certainly present.
And yet when pushed, the DBD will insist on the reliability far more so than a standard Developer. And this is where the IT Pro aspect comes out. The most beautiful query in the world is ugly if it is not robust. Some Developers would have the same attitude toward their own code. But I think a DBD is more passionate about reliability.
Earlier I used the angel v devil analogy. There are many stories of the artist who sold his soul to the devil. In many ways, a SQL person is like the drummer in the band. Many musicians don't consider the drummer to be a real musician. As a pianist and guitar player, I have told jokes like "What do you call a guy who hangs around with musicians? The drummer!" But of course, the drummer is a musician. He doesn't just provide the backbone and stability of the music, he determines the flavour. It's the drummer's creativity that sets the mood, the drive, the passion, more than the soloist lead guitarist or saxophonist. Just please don't tell my friend Dave Branton of justdrums.biz that I've said this. Despite the jokes, I know that when Dave plays, the music is different to when an ordinary drummer is there. The same applies for someone who really knows their databases. A good database can make or break an application.
It's interesting that the SQL groups in Australia run under a common banner, headed by Greg Linwood. We have a common site, we have sponsors at the top level, and that can then give financial assistance to the groups. It's particularly pertinent at the moment, as Microsoft are really pulling back on their budget for sponsoring the groups. The Developer community haven't been able to establish a common group in this way. And perhaps it's because of the creativity issue. Perhaps the fact that the SQL groups are able to work together is a reflection of our IT Pro nature. We're keen to do our own thing, but we also appreciate fully the requirement to adhere to standards.
Personally, I'm more DBD than DBA. I'm much more interested in going to a Developer group than a Sysadmin group. But yet, I run the SQL Group, and try to cater for both camps. I can assure you though - those people who make the effort to go to SQL Code Camp or TechEd are the DBDs in the group, not the DBAs.
Recently, my blog post entries haven't been appearing on TTB. But I can see a bunch appeared last night at 10:13pm. So I guess something got fixed somewhere.
From the MostRecent view (on various pages), I can see my site wasn't the only one that was being missed. Lots of entries from people all at the same time.
If you don't keep your eye on the TTB site, I really recommend you do. The fact that people in the community can give a 'thumbs up' to various posts really makes it a great resource. Thanks Frank, Mark, etc...
A few months ago, Roslyn noticed something on the TV she liked the look of. Not learning from last time, I decided to get it for her for her birthday (today). She likes it. So that's good. But it's still a kettle.
It's the Russell Hobbs Thermocolor Kettle. The slogan says "as the temperature changes, so does the colour!"
But it's not true.
Sadly, there are clearly those that do. But not this model. Not the one I bought Roslyn. This one only changes colour to let you know what you're doing with it. So it's purple while it's boiling, and as soon as it's done boiling, it switches back to blue. We had to specifically tell the kids "Just because it's blue, it doesn't mean it's cold."
She still likes it - but I'm going to call up Russell Hobbs and have a bit of a chat about it. It certainly doesn't do what it says on the box. It does look good though, and I'm not sure if I actually want a refund. It just doesn't have the massive safety feature associated with it.
Interestingly, when I do a search on the Argos site (shop by catalog place in the UK), I can find the kettle I bought, described not as a thermocolour one, but as a Glow Kettle.
The problem is that when you are somewhere that requires you to specify your WiFi connection's IP address, you need t change your settings from using DHCP. And you get sick of changing it manually! Like Dave Gardiner says, it would be useful to have a network profile switcher.
A friend at work recommends http://www.hudsonmobile.com/products/ipdash/index.html. Says it's very good - and it certainly looks it.
Another option would be to roll your own. The registry contains the details. I use TRE registry editor, which tells me that HKLM\Comm\tiacxwln1\Parms\TCPIP contains the useful stuff for this. So it should be easy to write an app which alters the entries there, and prompts you to save the settings as a profile you could switch to later.
Of course, writing one isn't my highest priority right now, but I'll try to do it some time...
Steve Jones from sqlservercentral.com posted an editorial today asking about what you'd get someone to do for you if you could get a helper. He wants a chef/cook to be able to provide regular meals for him.
Personally, I'd like someone in my life to help me keep accountable for all the things that I'd like to do. I would return the favour, and regularly ask them how they're going with achieving stuff. I heard something recently (was it on Hanselminutes?) about the idea of detailing what the next step is for each goal you have. So, the next step in fixing the fence could be counting the number of palings you need to buy, after you've done that, you update the list with the next step. You also have things dated so that you can see the things that you're being slack on.
But of course, this is so much easier if you have someone who will ask you how you're going with it. I will be asking a few people to do this with me, and I hope it helps. I'm not bad at getting things done, but I would like to be better.
And of course, this is probably the type of thing that dailydevelopers.com would like. :)
I think everyone needs a mentor in their life, and everyone needs people in their life who will keep them accountable to the things they want to be held accountable for. Of course, everyone has different things that they're trying to achieve. It might be dieting, quitting smoking, blogging more, keeping track of projects better, etc. But it's so much easier to do with help. Why not work out who you will help with things, and who you will ask for help from?
Yet another interesting post from Lemphers David. I know some people who consider MVP status something to be achieved. It certainly seems to have the highest ratio of usage on newsgroup forums - and by ratio I'm meaning "most people who have MVP status mention the fact". People don't tend to write what degrees they have, or their professional organisation status, and although many people mention their Microsoft certifications, the proportion of Microsoft certified people that do isn't nearly as high as the number of MVPs that have "MVP" written after their name. So clearly it's something that people ought to be proud of.
But as Dave (and Darren) clearly state (and I agree), it shouldn't be something that people pursue. It's not an achievement, it's an award. I like Kate Winslet's line in "Extras", when she says "If you do a film about the Holocaust, you're guaranteed an Oscar. That's why I'm doing it - Schindler's bloody List..." Sometimes I think people try to do things that make them get noticed for MVP status.
MVP status isn't like that, or at least shouldn't be. When I look at the MVPs I know, they're largely really helpful people, and are genuinely nice. I think if you're a scumbag, you're not actually that likely to get one. Presumably the fact that Microsoft have to know who you are, and ask your peers a bunch of questions too... well, this should filter out people who don't have the right personality.
I don't know if I'm going to get MVP status one day or not. There's also a part of me that worries that if I do, I might not live up to the standard of MVPs. These guys tend to know a lot, and they're almost always people that are online ALL the time. I like to be helpful, and will always do what I can to help colleagues, peers, whoever with their questions -but I don't spend hours each night looking for questions to answer on newsgroups. My blog isn't even that much of a 'resource', despite what I tell the people in my usergroup.
So what do I think of the MVP program? Well, I think it's good. Really good. I think it's a great way of recognising people who are _valued_ by Microsoft. I love the stories of people like Dave Lemphers, Angus Logan and Rocky Heckman, who were MVPs that got hired by Microsoft. That's a great endorsement of the MVP program, the fact that Microsoft likes these guys enough to actually hire them!
Following up on my last post... I know you could easily put together a CLR function that uses regular expressions to pull out the numbers, and this may well be faster. But it's a nice exercise to do it in T-SQL. :)
Steve Koop gave a great presentation at the user-group yesterday about upsizing from Access to SQL. One of the things he mentioned was that there is no equivalent for the Val() function in SQL. To fill in the non-Access-savvy, VAL('abad2j3lk2345') = 232345. It takes any numbers in a string, and converts them into an integer.
Here's my equivalent for SQL2005. Let's start by using our old favourite, the auxiliary table of numbers. I'll assume that we're doing defining @str as a varchar(1000) or nvarchar(1000). If you prefer varchar to nvarchar, then just change the strings accordingly.
So, with the auxiliary table of numbers (which goes from 1 to 1000), we can grab each character.
SELECT num, SUBSTRING(@str,num,1) AS ch FROM nums WHERE num <= LEN(@str)
Great. So now let's filter this to only pick up the ones that are numbers.
AND SUBSTRING(@str,num,1) BETWEEN N'0' AND N'9'
Let's add a rownum field to the results, but reversed (you'll see why in a second).
SELECT num, SUBSTRING(@str,num,1) AS ch, ROW_NUMBER() OVER (ORDER BY num DESC) as rn FROM nums WHERE num <= LEN(@str) AND SUBSTRING(@str,num,1) BETWEEN N'0' AND N'9'
But actually want I want is 10^rn, so let's do that instead. And while we're at it, let's cast that 'ch' field to an integer.
SELECT num, CAST(SUBSTRING(@str,num,1) AS INT) AS ch, POWER(10,ROW_NUMBER() OVER (ORDER BY num DESC)-1) as pwr FROM nums WHERE num <= LEN(@str) AND SUBSTRING(@str,num,1) BETWEEN N'0' AND N'9' You can see where I'm going now... let's multiply 'ch' and 'pwr', and sum them. We need to do this using a derived table, because you can't use a ranking function inside an aggregate.
SELECT SUM(ch * pwr) FROM (SELECT CAST(SUBSTRING(@str,num,1) AS INT) AS ch, POWER(10,ROW_NUMBER() OVER (ORDER BY num DESC)-1) AS pwr FROM nums WHERE num <= LEN(@str) AND SUBSTRING(@str,num,1) BETWEEN N'0' AND N'9' ) c
So now, throw this into a user-defined function, and you have a VAL() equivalent.
For those of you who are coming to Wagga for the SQL Code Camp (less than four weeks away), you'll hear this tip then. But for the rest of you, here's one of the great uses for row_number() in SQL2005.
Suppose you have a table of data, which includes some duplicates. It happened recently for a guy who posted to the SQLDownUnder mailing list. You want to get rid of duplicates, and keep only the first row for each subset of fields. For example, in the mailing list question, the example was that he wanted to filter to a single field, which I will call idfield, which contained duplicates. It's worth noting here that there is no primary key on the table. If there were, life would be much easier.
One option (without row_number()) is to use min() or max() for each field. Something like this:
select idfield, min(field1), max(field2) from dbo.sometable group by idfield
But some types, such as guids, cannot be used with min() and max(). So for that, let's try:
select t1.idfield, (select top 1 t2.field1 from dbo.sometable t2 where t2.idfield = t1.idfield), max(t1.field2) from dbo.sometable t1 group by t1.idfield
This will do the trick, but it makes life tough for checking that you get fields from the same row. After all, max(field1) and max(field2) might be from different rows. So you can use top with order by to get the right row.
select t1.idfield, (select top 1 t2.field1 from dbo.sometable t2 where t2.idfield = t1.idfield order by field1, field2, field3, field4), (select top 1 t2.field2 from dbo.sometable t2 where t2.idfield = t1.idfield order by field1, field2, field3, field4) from dbo.sometable t1 group by t1.idfield
And you can see, we're getting pretty ugly. So let's try with row_number(). If we partition over the field that we want to become unique, then our row_number() field starts again from 1 each time it changes. This is ideal, because if we filter to rows which have row_number() as 1, we get exactly what we want!
select * from (select *, row_number() over (partition by idfield order by idfield, field1, field2) as rn from dbo.sometable) t where t.rn = 1
Lovely. Of course, you wouldn't use "select *", but you get the drift. :)
A while back, Jay Furr asked me what my leitmotif was. Now I find myself talking to Lemphers about what his should be.
For those of you who haven't ever come across the word, it refers to that piece of music which identifies you. Like in Peter and the Wolf, how whenever it mentions the duck, it plays a certain piece of music. But actually more like boxers, who have a particular track played when they walk in.
I still don't know what mine should be. But Lemphers wants to have "You can't touch this". I'm thinking that every time he walks into the building, they should play it through some loud speakers. Then everyone will know that he has arrived.
So I'm talking to Dave Lemphers (of "Time spent here you will never get back" fame) on IM, and one of the things we occasionally talk about is how to develop presenters, and I (bizarrely) suggested having "presentation awards". The idea would be to have our own version of the Perrier awards as a way of encouraging and developing presenters (at user-groups, in podcasts, etc).
Of course, it raises the question of why to present? I guess it's similar to the question of why to blog?
On a personal level, there are a few obvious ones.
- Shows other people that you know your stuff
- Develops your status in the community
- Deals with that fear of public-speaking
- Increases your consulting price
- Stops you from having to apply for work, because Readify might approach you
Naturally, your employer will be keen for all of the above (especially if they're Readify), and therefore will encourage you to present as well. But for some reason, the number of people who actually present is quite small. So that's where some annual awards come in. TechEd in Europe have Speaker Idol this year. This is a great idea, and perhaps we'll have something like that in Australia next year. But what about the user-groups? How can we get more people speaking? As someone who runs a smaller-city user-group, I'd like to have a fund to bring people across from interstate more often, or to send the people in my group to other states. But what else can I do? Please post ideas...
First Steve Irwin, now Peter Brock. Who next, and what will they be doing?
I wasn't sure if these were going to become available, but it seems they have!
http://blogs.msdn.com/charles_sterling/archive/2006/09/05/740899.aspx for all the links. The DB ones are about databases, and IW are about Office and Sharepoint (IW = Information Worker). Those are the two that I proctored for this year.
@@datefirst tells you what SQL says the first day of the week is (see the linked MSDN page for more). 1=Monday, 7=Sunday. The default for US English is 7. I guess the people who decide when movies come out have it set to 4 or 5.
But, should you let the start of the week be determined by @@datefirst, so that if policy on the Sunday/Monday argument changes you can just alter @@datefirst and let it be done?
Or should you make your code enforce the start of the week as per the policy, regardless of what @@datefirst is set to, with code such as:
declare @df_orig int select @df_orig = @@datefirst --This is so we can reset it at the end declare @df int set @df = 1 while (@df <= 7) begin set datefirst @df select @@datefirst as df, datepart(dw,'20060907') as dw, (@@datefirst + datepart(dw,'20060907')) % 7 as fixed set @df = @df + 1 end set datefirst @df_orig --Resetting it
So, (@@datefirst + datepart(dw,'20060907')) % 7 will always give 5.
If I want to adjust a date so that I count the number of weeks based on a Thursday start, I can do this with:
datediff(week, '19900101',dateadd(day, @@datefirst - 4, '20060907')) (where '19900101' is an arbitrary date - if you want to start counting at one somewhere else, then be my guest. The day of the week of this date is not important - it just marks the week in which our 'weeknumber' will be zero)
Why is this useful?
For example - my typical week starts on Sunday. I get frustrated by calendars that start on Mondays, it always throws out what day I think things are. If only I could change the calendars that my kids' school sends us...
But pay-wise, the fortnight starts on a Monday. So I have a sample fortnight start in the system, and I use datediff(day,@fortnightstart,@d)/14 to determine the fortnight that something fits into.
For people paid by the month, the month actually starts on the 25th of each month. So for that, I subtract the offset and strip the day off.
So for any code that I write, I want to know what fixes the start of the week/month. If it's a company-wide policy, then @@datefirst could make sense (and it makes calculations easier). If it's fixed by something else, then it's worth making sure that you have this defined somewhere, and use that. Like in a table of system-wide config values or something. Grab @fortnightstart from there before you need it, or @paymonthoffset, or whatever.
You can also find this at: http://sqlblogcasts.com/blogs/rob_farley/articles/999.aspx
You don't ever want to see these messages in short succession:
Dave has sent you a message: brb, toilet Dave has just signed out. Dave has just signed in from a mobile device.
(apologies to the Daves I know - I'm sure this wouldn't be one of you)
|
|