Author Topic: LOTW  (Read 73221 times)
KY6R
Member
Posts: 3298
« Reply #285 on: December 12, 2012, 08:07:54 AM »

If the ETL is a bottleneck too, then that will have to be redesigned, I guess...

If ETL is the bottleneck and they have row-level locking, then they can download Pentaho for free (the Community edition), and it would be a total piece of cake to automatically multi-thread the ETL and get much faster throughput.

Of course, I am being a bit of an armchair quarterback Grin. They might already have row-level locking and multi-threaded ETL. If they do, then adding hardware and tuning the parallelism would do the trick. Inserting LOTW files should be pretty easy, actually.
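The multi-threaded ETL idea above can be sketched in a few lines. This is a toy illustration, assuming batches of QSO records can be processed independently (as they can under row-level locking); the batch layout, function names, and worker count are invented for the sketch and are not LoTW's actual pipeline or Pentaho's API.

```python
from concurrent.futures import ThreadPoolExecutor

def process_batch(batch):
    # stand-in for parse/validate/insert of one batch of QSO records
    return len(batch)

# 40 hypothetical upload batches of 100 records each
queue = [[f"QSO-{i}" for i in range(100)] for _ in range(40)]

# independent batches fan out across a worker pool instead of
# serializing behind a single-threaded loader
with ThreadPoolExecutor(max_workers=8) as pool:
    processed = sum(pool.map(process_batch, queue))

print(processed)  # 4000 records handled across 8 workers
```

The point is only structural: once batches do not contend for one table lock, throughput scales with workers until the disk or CPU saturates.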
N7SMI
Member
Posts: 373
« Reply #286 on: December 12, 2012, 08:46:17 AM »

Quote
Inserting LOTW files should be pretty easy, actually.

Yep. As I've monitored the numbers over the last few weeks and have done the math, they are processing at most 2-3 QSOs per second. Even though they're dealing with a fair amount of data to query, this is extremely slow. I'd expect no more than a few milliseconds per transaction. My experience has shown that unless you are very seriously underpowered, hardware upgrades are simply a temporary bandaid on a more serious database/query design problem.

Not to compare LoTW to Google, but I'm always amazed how Google can query about 100 billion web pages (they index or re-index about 20 billion per day) and give me useful results in less time than it takes me to search for a filename in a folder on my own hard drive.
WW3QB
Member
Posts: 698
« Reply #287 on: December 12, 2012, 09:11:47 AM »

Quote
Inserting LOTW files should be pretty easy, actually.

Yep. As I've monitored the numbers over the last few weeks and have done the math, they are processing at most 2-3 QSOs per second. Even though they're dealing with a fair amount of data to query, this is extremely slow. I'd expect no more than a few milliseconds per transaction. My experience has shown that unless you are very seriously underpowered, hardware upgrades are simply a temporary bandaid on a more serious database/query design problem.

Not to compare LoTW to Google, but I'm always amazed how Google can query about 100 billion web pages (they index or re-index about 20 billion per day) and give me useful results in less time than it takes me to search for a filename in a folder on my own hard drive.

Check my math, but at 3 QSOs per second, it will take 35 days for a log submitted today to be processed. So it is too late to submit a log for DXCC year-end processing. It also means the queue will not catch up before the RTTY Roundup logs hit.
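The math above checks out. A quick back-of-envelope calculation, assuming the roughly 8.94 million QSO backlog reported elsewhere in this thread and a steady 3 QSOs per second:

```python
# drain time for the queue at the observed processing rate
backlog = 8_940_000            # QSO records waiting (thread estimate)
rate = 3                       # QSOs processed per second
days = backlog / rate / 86_400 # 86,400 seconds in a day
print(round(days, 1))          # roughly 34-35 days
```

So a log uploaded today sits in the queue for over a month even if no new uploads arrive, which of course they do.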
EI2GLB
Member
Posts: 597
« Reply #288 on: December 12, 2012, 10:13:26 AM »

Some of you may have seen that Clublog now has a log-matching option to see if your QSOs match other users'.

Seemingly Clublog can upload 4,000 QSOs per second, and 2 or 3 hours after they switched on the log matching it had already matched 5 million QSOs, and that was running only when the log queue was empty.

Clublog is run as a hobby; LOTW is a money-making racket.

If I were still a member of the ARRL, I would be sending them a very strongly worded email telling them to get their house in order.

I hope those of you who are members have done so.

Trevor
EI2GLB
AB8MA
Member
Posts: 767
« Reply #289 on: December 12, 2012, 10:23:05 AM »

Almost 2% of the total number of QSO's processed by LoTW since it started are currently sitting in the queue waiting to be processed.
W9KEY
Member
Posts: 1165
« Reply #290 on: December 12, 2012, 12:39:00 PM »

Quote
Almost 2% of the total number of QSO's processed by LoTW since it started are currently sitting in the queue waiting to be processed.

Are you sure? That would be 1 out of every 50 Undecided. Do duplicates get stuck in the queue and never get discarded?
NI0C
Member
Posts: 2438
« Reply #291 on: December 12, 2012, 12:54:31 PM »

Quote
are you sure?  that would be 1 out of every 50

Let's do the math:
The current reported backlog is approx. 8.94 million QSO records. The database contains approx. 462 million QSO records. Dividing 8.94 by 462 and multiplying by 100 yields 1.9 percent.

This is significant, because it shows how usage of LoTW has accelerated recently. LoTW has been in place for maybe 100 months (more or less). Dividing the database size of 462 million records by 100 yields an average of 4.6 million records per month input to the system. Thus the current backlog is nearly two average months' worth of QSO's.
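The arithmetic above reproduces cleanly (the 100-month service life is the post's own rough estimate):

```python
# backlog share of the database, average monthly input,
# and the backlog expressed in "average months"
backlog = 8.94e6          # QSO records queued
total = 462e6             # QSO records in the database
months_in_service = 100   # rough estimate from the post

pct = backlog / total * 100            # share of all records ever processed
per_month = total / months_in_service  # average monthly input
backlog_months = backlog / per_month   # backlog in average months

print(round(pct, 1), round(per_month / 1e6, 2), round(backlog_months, 1))
# 1.9 4.62 1.9
```

Note that the backlog-as-a-share figure and the backlog-in-months figure are the same number by construction, since both divide the backlog by the database total (scaled by 100 either way).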

73,
Chuck  NI0C
K4JK
Member
Posts: 320
« Reply #292 on: December 12, 2012, 01:01:14 PM »

I would wager that a significant portion of those 8.9 million QSOs are duplicates from people uploading their entire or partial logs multiple times. Sure, there is a sizable number of contest logs too, but I highly doubt organic growth caused usage to spike that much in the past two months.

ex W4HFK
N3QE
Member
Posts: 2432
« Reply #293 on: December 12, 2012, 01:27:30 PM »

Quote
Thus the current backlog is nearly two average months worth of QSO's.

Yes, it is true. CQ WW CW alone is nearly two average months' worth of QSO's in 48 hours :-). Add on ARRL 160, the 10M contest, and Sweepstakes a few weeks before that... and come to the conclusion that November is hardly an "average" month!!!!

By the way Chuck, your very nice paper QSL card arrived several days before my 160 log was processed by LOTW :-)

Tim.
« Last Edit: December 12, 2012, 01:32:27 PM by N3QE »
N7SMI
Member
Posts: 373
« Reply #294 on: December 12, 2012, 02:22:33 PM »

Quote
Check my math, but at 3 QSOs per second, it will take 35 days to get to a log submitted today processed.

Correct. For kicks, today I sampled several hours worth of status data from the LoTW site and the average has been 159.8 QSOs processed per minute (high of 280 and low of 44), or 2.66 per second. At that rate it will take 39 days to process the 9 million QSOs currently in the queue.

I uploaded logs about every other day through November, and the most recent QSL I have is November 19th, which was 22 days ago.

I'll repeat my data collection regularly and see how things change, but it seems to me that the entire system is teetering on the edge of going tits up (<- that may be a colloquialism, but I think you get the idea).
NI0C
Member
Posts: 2438
« Reply #295 on: December 12, 2012, 02:41:28 PM »

N3QE wrote:
Quote
By the way Chuck, your very nice paper QSL card arrived several days before my 160 log was processed by LOTW :-)

I'm glad you liked the card, Tim.  I enjoyed working with KB3IFH to design some cards using my own photos.   
73,
Chuck  NI0C
N4CR
Member
Posts: 1703
« Reply #296 on: December 12, 2012, 06:03:31 PM »

Quote
For kicks, today I sampled several hours worth of status data from the LoTW site and the average has been 159.8 QSOs processed per minute (high of 280 and low of 44), or 2.66 per second.

I work for a Fortune 100 company as a software and database specialist.

If I heard a system was doing what is stated above, there is no way I'd start by looking for a problem in hardware. Without being able to do hands-on analysis, I'd guess this system is doing table scans on reads or writes. Once the size of the database exceeds memory constraints, it becomes disk-channel bound, which drops the speed by a factor of about 1,000. A primary indication of this is that the CPU load is low and the disk channel (read/write light) is balls to the wall.

If it is this, the temporary cure is to add RAM to get all of the database in memory again; the real fix is to find out why the table scans are occurring and fix the design flaw.

But this is all wild speculation from hallway talk, and that's a HUGE if.

From experience, I've found that throwing hardware at a problem should be the last step, after every advantage has been wrung out of the software.
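The table-scan problem described above is easy to demonstrate in miniature. This sketch uses SQLite purely for illustration; the `qso` schema and index name are invented here and are not LoTW's actual design. `EXPLAIN QUERY PLAN` shows whether a query scans the whole table or seeks through an index.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE qso (callsign TEXT, band TEXT, qso_date TEXT)")
db.executemany("INSERT INTO qso VALUES (?, ?, ?)",
               [(f"K{i}ABC", "20M", "20121212") for i in range(1000)])

query = "SELECT * FROM qso WHERE callsign = 'K42ABC'"

def plan(sql):
    # the last column of each EXPLAIN QUERY PLAN row describes the access path
    return " ".join(row[3] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

before = plan(query)   # reports a scan of the whole table
db.execute("CREATE INDEX idx_call ON qso(callsign)")
after = plan(query)    # now a search using idx_call

print(before)
print(after)
```

On a table that fits in cache the difference is invisible; on a 462-million-row table that has outgrown RAM, the scan turns every lookup into a pass over the disk, which is exactly the low-CPU, saturated-I/O signature described above.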

73 de N4CR, Phil

Never believe an atom. They make up everything.
W9KEY
Member
Posts: 1165
« Reply #297 on: December 14, 2012, 01:53:19 AM »


Quote
I work for a fortune 100 company as a software and database specialist.

If I heard a system was doing what is stated above, there is no way I'd start by looking for a problem in hardware. Without being able to do hands on analysis, I'd guess this system is doing table scans on reads or writes. Once the size of the database exceeds memory constraints, it becomes disk channel bound. Which drops the speed by a factor of about 1000. A primary indication of this is that the CPU load is low and the disk channel (read/write light) is balls to the wall.

If it is this, the temporary cure is to add ram to get all of the database in memory again, the real fix is to find out why the table scans are occurring and fix the design flaw.

But this is all wild speculation from hallway talk and that's a HUGE if.

From experience, I've found that throwing hardware at a problem should be the last step after every advantage can be wrung out of the software.

Any chance you will be in CT anytime soon and are feeling charitable? Cheesy

Seriously though, how much time would you expect a good analysis of the software to take, and how expensive would that be?
N2RJ
Member
Posts: 1238
« Reply #298 on: December 14, 2012, 06:12:05 AM »


Quote
Not to compare LoTW to Google, but I'm always amazed how Google can query about 100 billion web pages (they index or re-index about 20 billion per day) and give me useful results in less time than it takes me to search for a filename in a folder on my own hard drive.

I've been inside a Google datacenter and they have a metric ton of hardware, literally.

Of course that's not the whole story. They are also very innovative with software, making extensive use of things like memcache (in YouTube).

We switched over a lot of our stuff here to use memcache and the performance increase was pretty staggering.
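The memcache pattern mentioned above is cache-aside: check the cache first, and fall through to the database only on a miss. A minimal sketch, using a plain dict as a stand-in for a memcached server so it is self-contained (with a real server you would swap in a client such as pymemcache using the same get/set shape); the lookup function and key names are invented for illustration.

```python
import time

cache = {}

def slow_lookup(callsign):
    # stand-in for an expensive database query
    time.sleep(0.01)
    return f"confirmed:{callsign}"

def get_confirmation(callsign):
    hit = cache.get(callsign)
    if hit is not None:
        return hit              # served from cache, no DB round trip
    value = slow_lookup(callsign)
    cache[callsign] = value     # populate the cache on a miss
    return value

get_confirmation("W1AW")        # first call misses and hits the slow path
get_confirmation("W1AW")        # second call is served from the cache
```

For read-heavy lookups (award status pages, confirmation checks) this kind of caching can take most of the load off the database entirely, which is presumably where the "staggering" improvement came from.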
N2RJ
Member
Posts: 1238
« Reply #299 on: December 14, 2012, 06:14:24 AM »

I have never been a DBA, but it seems to me that some threshold has been exceeded and needs to be analyzed. Updated hardware is not a long-term fix.

Quote
I sure hope they aren't using MySQL with the MyISAM engine - and table level locking . . . . . and single threaded ETL. Hardware won't fix that. I just fixed a big Data Warehouse that had this problem at Lithium Technologies . . . . OUCHIES . . .

That was a pain in the neck for us, especially since we were using that god awful MySQL replication. We've now switched to InnoDB. Much better.