Answer:
The answer posted here was prompted by this particular question:
>I want to get some non-repeated distinct records from a records pool which, of course, contains only distinct records. For example, suppose I have 100 distinct records in an array, I want to randomly pick 5 records from these 100 records. These records should not be repeated. I know there are lots of algorithm, but I don't want to use loops, unless I really have to, to check if a record has been picked already. I want an efficient method since it is a database problem.
There are three simple solutions. (And there are more complicated ones, for more complicated situations.)
In general, you should prefer the solutions in *numerical* order, though there are surely cases where 3 is better than 2. Solution 1 is always the best, if it is applicable to your data.
They apply to these three situations:
(1) When you have a table that has an autoincrement or identity field that starts at a known number and goes to a known end number with no gaps. In the case of relatively static data (e.g., picking 5 ad links from a set of 100), you should be able to establish this kind of table. If so, then use solution number 1 (below).
(2) For tables where such a gap-less numbering system is impossible or impractical (e.g., a "live" data system) but where the total number of records is a reasonable number (certainly 100 is viable, perhaps even 1000 if you are desperate), see solution number 2 (below).
(3) When you have a very small table and need only a few fields from each record to build your HTML page. (As an arbitrary figure, let's say the product of record count times fields needed is no more than 200 to 2000 for this scheme, depending upon how much memory your system has and how heavily loaded it is.) If this matches your situation, use solution number 3 (below).
If none of those situations obtain, then the best solution is probably to build ANOTHER table with the characteristics described in situation 1, where the sequentially numbered records simply contain references to the records in the main table. If the main table is dynamically updated, then you will want to rebuild the auxiliary table from time to time.
Anyway....
Solution #1:
If possible, keep the count of the number of records someplace handy, outside the DB. If not, find the count via something like
    Set RS = conn.Execute("Select Count(*) AS RecCount From table")
    recCount = RS("RecCount")
and then in either case do:
    Randomize ' seed the generator, or Rnd repeats the same sequence on every run
    CONST choiceCount = 5 ' or however many you want
    CONST firstRecNum = 1 ' adjust if not true for your table
    choices = ""
    chosen = 0
    Do While chosen < choiceCount
        choose = Int( Rnd * recCount ) + firstRecNum
        chTest = "#" & choose & "#"
        If InStr( choices, chTest ) = 0 Then
            choices = choices & chTest
            chosen = chosen + 1
        End If
    Loop
    choices = Replace( choices, "##", "," )
    choices = Replace( choices, "#", "" )
    SQL = "SELECT fld1,fld2 FROM table " & _
          "WHERE sequenceField IN (" & choices & ")"
    Set RS = conn.Execute( SQL )
Do you see the tricks? I insist on finding the record number surrounded by #...# in the string of remembered record numbers, so that I won't accidentally get a match on 1234 for newly chosen number 23 (because 1234 will be in the choices string as "#1234#" and I will be searching for "#23#"). Then I change all the adjacent ## characters to single commas and change the leading and trailing single # characters to nothing. (So the choices string that I built up to "#517##23##765##92##412#" becomes "517,23,765,92,412".)
Finally, the IN phrase in the WHERE clause will now get us *exactly* the 5 records we wanted. We hit the DB as lightly as we can!
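The two string tricks described above can be pulled out into small functions to make them easier to see in isolation. This is only a sketch: the names AlreadyChosen and ToCsv are made up for illustration and do not appear in the code above.

```vbscript
' Hypothetical helper: True if num is already recorded in a
' "#"-delimited list such as "#1234##92#".
Function AlreadyChosen(choices, num)
    AlreadyChosen = (InStr(choices, "#" & num & "#") > 0)
End Function

' Hypothetical helper: converts "#517##23#" style text into "517,23".
Function ToCsv(choices)
    Dim s
    s = Replace(choices, "##", ",")   ' adjacent delimiters become commas
    ToCsv = Replace(s, "#", "")       ' drop the leading and trailing "#"
End Function

Dim remembered
remembered = "#1234##92#"

' A bare InStr(remembered, "23") finds the "23" inside "1234" and so
' returns a nonzero position. The delimited search does not:
'   AlreadyChosen(remembered, 23) is False
'   AlreadyChosen(remembered, 92) is True
'   ToCsv("#517##23##765##92##412#") is "517,23,765,92,412"
```

Nothing here touches the database, so the functions can be tried out in a plain .vbs file before being dropped into an ASP page.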
Solution #2:
This assumes you have relatively few records. Say up to 1000. It also assumes that you have a unique primary key field!
    ' if the primary key is not numeric!
    DELIM = Chr(39) & "," & Chr(39)
or
    ' if the primary key *is* numeric
    DELIM = ","

    Randomize ' seed the generator, or Rnd repeats the same sequence on every run
    Set RS = conn.Execute("SELECT primaryKeyField FROM table")
    allKeys = RS.GetRows ' convert RS to an array!
    recCount = UBound( allKeys, 2 ) + 1
    ' a lot of similarity to solution #1...
    CONST choiceCount = 5 ' or however many you want
    choices = ""
    keys = ""
    chosen = 0
    Do While chosen < choiceCount
        choose = Int( Rnd * recCount )
        chTest = "#" & choose & "#"
        If InStr( choices, chTest ) = 0 Then
            choices = choices & chTest
            keys = keys & DELIM & allKeys(0,choose)
            chosen = chosen + 1
        End If
    Loop

    ' if the primary key is not numeric!
    keys = Mid( keys, 3 ) & Chr(39)
or
    ' if the primary key *is* numeric
    keys = Mid( keys, 2 )

    SQL = "SELECT fld1,fld2 FROM table " & _
          "WHERE primaryKeyField IN (" & keys & ")"
    Set RS = conn.Execute( SQL )
As you can see, this is really the same solution as #1 except that instead of using the simple sequential numbers we use the record number in the array to pick the primary key and *then* pick the entirety of the needed records.
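One thing the DELIM and Mid approach does not handle is a text key that itself contains an apostrophe, which would break the quoting in the IN list. The sketch below is a different way to build the same list, not the method used above: it collects the chosen keys in an array and joins them, doubling any apostrophes along the way. The function name BuildInList and its parameters are hypothetical, and doubling the apostrophe is assumed to be how your database escapes a quote inside a string literal.

```vbscript
' Hypothetical helper: builds the text that goes inside IN ( ... )
' from a 1-D array of key values.
Function BuildInList(keyValues, keysAreNumeric)
    Dim i, parts()
    ReDim parts(UBound(keyValues))
    For i = 0 To UBound(keyValues)
        If keysAreNumeric Then
            ' CLng raises an error if the value is not a whole number
            ' that fits in a Long, so nothing else can slip into the SQL.
            parts(i) = CStr(CLng(keyValues(i)))
        Else
            ' Assumption: '' is the escape for ' inside a string literal.
            parts(i) = "'" & Replace(keyValues(i), "'", "''") & "'"
        End If
    Next
    BuildInList = Join(parts, ",")
End Function

' BuildInList(Array(517, 23, 765), True)         gives 517,23,765
' BuildInList(Array("O'Brien", "Smith"), False)  gives 'O''Brien','Smith'
```

To use it with solution #2, you would store each allKeys(0,choose) value in an array inside the loop instead of appending it to the keys string.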
Solution #3:
But if you have relatively small records and relatively few of them, then why not eliminate the two-step process of solution #2? Okay, I give up...why not:
    RANDOMIZE ' don't forget this one!
    ' select *only* the fields you must have, for efficiency
    Set RS = conn.Execute("SELECT fld1,fld2,fld3 FROM table")
    allRecords = RS.GetRows ' convert RS to an array!
    recCount = UBound( allRecords, 2 ) + 1
    ' still similarity to solution #1...
    CONST choiceCount = 5 ' or however many you want
    choices = ""
    chosen = 0
    Do While chosen < choiceCount
        choose = Int( Rnd * recCount )
        chTest = "#" & choose & "#"
        If InStr( choices, chTest ) = 0 Then
            choices = choices & chTest
            chosen = chosen + 1
            ' Then RIGHT HERE put out the HTML that
            ' is appropriate for the chosen record!
            info1 = allRecords( 0, choose )
            info2 = allRecords( 1, choose )
            info3 = allRecords( 2, choose )
            Response.Write "<SOMEHTMLTAG>" & info1 ' ..... ...
        End If
    Loop
So the common trick to all of these is the use of a string to "remember" the records already chosen and then the search within that string via the InStr function to check for already chosen records. And then we just make sure that we use delimiters that will ensure we don't get any "false matches" and the rest is easy.
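Since all three solutions share the same pick-and-remember loop, it can be wrapped in one function. The sketch below does that, under the assumption that you want zero-based numbers (as in solutions #2 and #3); the name PickDistinct is made up for illustration. It also adds one guard that the loops above do not have: if you ask for more records than exist, the original loop would never finish, so the request is capped at the number available.

```vbscript
' Hypothetical helper: returns howMany distinct integers in the range
' 0 .. total-1 as a "#"-delimited string such as "#4##17##0#".
Function PickDistinct(howMany, total)
    Dim wanted, choices, chosen, choose, chTest
    wanted = howMany
    ' Guard (an addition to the original loops): never ask for more
    ' distinct numbers than there are, or the loop cannot terminate.
    If wanted > total Then wanted = total
    choices = ""
    chosen = 0
    Do While chosen < wanted
        choose = Int(Rnd * total)
        chTest = "#" & choose & "#"
        If InStr(choices, chTest) = 0 Then
            choices = choices & chTest
            chosen = chosen + 1
        End If
    Loop
    PickDistinct = choices
End Function

Randomize   ' seed once per page, before the first call
Dim picked
picked = PickDistinct(5, 100)   ' five distinct numbers from 0 to 99
```

For solution #1 you would add firstRecNum to each number, or change the Int( Rnd * total ) line inside the function to match.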
For more information on selecting random records from a database table, check out these 4Guys articles:
Returning a Random Number of Database Records
Choosing a Random Record from a Recordset
Getting a Random Record Using a Stored Procedure
Returning Rows in Random Order