[Ecm-dev] Resuming ECM computations with GMP-ECM 6.0.1

Paul Zimmermann Paul.Zimmermann at loria.fr
Mar 18 Avr 15:30:52 CEST 2006


       Dear Thorkil,

your question is far from trivial. I'm quite busy now, thus a short answer.

First the 206 runs with B1=50000 assume you start from scratch. If you
already did 77 curves with B1=11000, those are equivalent to about 10
curves with B1=50000 (for 25-digit factors, cf the output of ecm -v),
thus you really need only 196 (fresh) curves.

You can of course extend the 77 (old) curves to B1=50000, but I'm not sure
you'll get a global speedup (although the theoretical question is interesting)
and you also need to maintain the save file, which can become quite complex
if you factor (say) 1000 factors with 1000 curves each.

I'm wishing you good luck with your implementation! You may find some ideas
for stage 2 in the report "20 years of ECM" from my web page.

Paul Zimmermann

> From: Thorkil Naur <naur at post11.tele.dk>
> Date: Mon, 17 Apr 2006 18:06:08 +0200
> 
> Dear Paul Zimmermann,
> 
> Looking at the GMP-ECM package, I note that you seem to be one of the 
> principle contributors. I hope therefore that you will consider the following 
> question.
> 
> On 2006-April-15, I have downloaded ecm-6.0.1.tar.gz, This seems to be the 
> latest available version of the GMP-ECM package. In the README file, I find 
> the following in the section headed "6. Options -save and -resume.":
> 
> (b) somebody did a previous step 1 for P-1 or P+1 up to a given bound
>     B1, and you want to extend that computation with B1' > B1, without
>     restarting from scratch. Note: this does not apply to ECM, where 
>     the smoothness property depends on the (random) curve chosen, not
>     only on the input number.
> (c) ...
>     For the same reason as (b), this does not apply to ECM. 
> 
> The question concerns the statement "this does not apply to ECM". What does 
> this mean? As far as I have been able to tell, saving the ECM step 1 state 
> and resuming is certainly supported by GMP-ECM.
> 
> To elaborate this question, let's say that we are factoring some large 
> composite N with no prior knowledge of its factors. Ignoring for simplicity 
> P-1 and P+1, "2. How to efficiently use P-1, P+1 and ECM?" advices us to 
> carry out the following ecm runs:
> 
> - 77 runs with B1=11000 (assuming "default poly" and "default B2")
> - 206 runs with B1=50000
> - 401 runs with B1=250000
> 
> And so on, hopefully finding a factor at some point.
> 
> Consider now the situation after running the 206 curves with B1=50000. At this 
> point, a total of 77+206=283 curves have been used, the first 77 with 
> B1=11000, the remaining 206 with B1=25000.
> 
> However, wouldn't we be better off if we, instead of running 206 entirely new 
> curves, continued the work performed for the first 77 curves (extending from 
> B1=11000 to B1=50000) and only then computed 206-77=129 entirely new curves 
> all the way to B1=50000? In this way, work corresponding to the computation 
> of the 77 curves to B1=11000 would be saved.
> 
> I realize, of course, that having run the 283 curves (77 to B1=11000 and 206 
> to B1=50000) gives a somewhat higher probability of finding a factor than 
> having only run 206 curves to B1=50000. But the work used to run the 
> additional 77 curves to B1=11000 seems better spent on getting ahead with the 
> 401 runs with B1=250000.
> 
> This idea carries on, of course, with the 206 curves with B1=50000 being used 
> as a starting point for the 401 curves to B1=250000 etc.
> 
> I would be very grateful for any comments that you may have to this question.
> 
> I should note that I am working on my own implementation of ECM (and other 
> integer factorization methods). In my implementation, the complete state of 
> the factorization of some number can be stored in a single file. This is used 
> to checkpoint the state of long computations regularly, say, once an hour. It 
> also allows computations to be interrupted and resumed easily.
> 
> The state stored for a number can include the ECM state for any number of 
> curves, thus allowing ECM computations to be extended in any desirable way.
> 
> (I should also note that with respect to performance, my implementation lacks 
> considerably when compared to GMP-ECM. In particular, my stage 2 should 
> probably be thrown in the garbage can.)
> 
> Many thanks for any time that you spend on this.
> 
> Best regards
> Thorkil Naur
> 




Plus d'informations sur la liste de diffusion Ecm-discuss