A Tour of NTL: NTL past, present, and future
Some History
Work on NTL started around 1990, when I wanted to implement some new
algorithms for factoring polynomials over finite fields.
I found that none of the available software was adequate for
this task, mainly because its code for polynomial arithmetic
was too slow.
So I wrote my own.
My starting point was Arjen Lenstra's LIP package for long integer
arithmetic, which was written in C.
It soon became clear that using C++ instead of C
would be much more productive and less prone to errors,
mainly because of C++'s constructors and destructors,
which allow memory management to be automated.
Using C++ has other benefits as well, like function
and operator overloading, which makes for more readable code.
One of the basic design principles of LIP was portability.
I adopted this principle for NTL as well, for a number of reasons,
not the least of which was that my computing environment
kept changing whenever I changed jobs.
Achieving portability is getting easier as standards,
like IEEE floating point, become widely adopted, and as the
definition and implementations of the
C++ language stabilize.
Since 1990, NTL has evolved in many ways,
and it now provides a fairly polished and well-rounded programming interface.
When I started working on NTL, there really were not that many
good, portable long integer packages around.
Besides LIP, there was the BSD Unix MP library.
The first version of GMP was released in the early 1990s.
At that point in time, LIP seemed like the best starting point.
LIP remains a reasonable long integer package, but in recent years,
GMP has really become quite good: it seems well supported on
many platforms, and is typically much faster than LIP.
I've now re-structured NTL so that one can use
either 'traditional' LIP or GMP as the long integer package.
The Future of NTL
As you can well imagine, there is potentially no end to the algorithms
one could implement, so I have to stop somewhere.
I think NTL has reached a point where it provides a reasonably
well-rounded suite of algorithms for basic problems.
I plan to continue supporting NTL, fixing bugs and improving performance.
While I don't have time to add significant new functionality to NTL,
there seems to be an ever-growing number of NTL users
out there, and I encourage them to make their code available to
others.
These might be in the form of NTL "add-ons", but there is also the
possibility of integrating
new functionality or algorithmic improvements into NTL itself.
Wish list
These are a few things I hope others might contribute to NTL.
I'd be happy to discuss and assist with any design and integration issues,
or any other ideas for improvement.
I'd also be happy to discuss ideas for making NTL more
open, so that it is easier for others to contribute.
- Support for bivariate polynomial arithmetic, including GCDs,
resultants, and factoring, over the integers and all the various
finite field coefficient rings.
- Code for elliptic curves,
including an elliptic curve point counting algorithm.
- Integer factorization algorithms.
- Implementations of some of the newer lattice basis reduction algorithms.
- The polynomial multiplication algorithms over ZZ could be improved.
One specific improvement: the Schoenhage-Strassen algorithm
currently does not incorporate the so-called "square root of two trick"
(see the note after this list).
- Improvements to some of the RR algorithms.
In particular, the trig, exp, and log functions are currently woefully
inefficient.
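
As a side note on the "square root of two trick": this is a standard
fact about the Schoenhage-Strassen ring, recorded here only for
reference. Working modulo $2^n + 1$ with $4 \mid n$, we have
$2^n \equiv -1$, so setting $\psi = 2^{3n/4} - 2^{n/4}$ gives

    \psi^2 = 2^{3n/2} + 2^{n/2} - 2 \cdot 2^n
           = 2^{n/2} (2^n + 1) - 2 \cdot 2^n
           \equiv 2 \pmod{2^n + 1}.

Thus $\psi$ is a square root of 2; since $\psi^{2n} = 2^n \equiv -1$,
it is a primitive $4n$-th root of unity, so using powers of $\psi$ in
place of powers of 2 doubles the usable transform length.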
Some things I plan to work on
Here are a few things I plan to work on in the near future.
- Now that NTL is thread safe, it is possible to use multiple cores
within NTL to improve performance.
One possibility is to utilize multiple cores in the modular
FFT implementation of polynomial multiplication.
Both the FFT (over different small primes) and reduce/CRT
(over different coefficients) steps are trivially parallelizable
(a sketch appears after this list).
- Introducing specialized types for single-precision computation.
NTL relies on fast modular arithmetic of single precision values.
Right now the type long is used to store these
values, and the type double is used to store
floating point approximations to auxiliary values (e.g., inverses).
I'd like to introduce a level of "type indirection" here.
This would allow two new implementations:
  - On LP64 platforms that support extended FP,
  one could use long double in place of double,
  which would allow 60-bit, rather than 50-bit, single precision values.
  This could be more convenient and a little faster:
  speed-wise, a few things could get slower, but I expect a net gain.
  - On LP64 platforms, use 30-bit single precision values stored in
  int instead of long.
  This would open up the possibility of using vectorized instructions
  and/or GPUs.
  Both of these specialized hardware features typically do not support
  64-bit integer arithmetic.
  Again, some things could get slower while others get faster,
  and I would expect a net gain.
In any case, it could not hurt to use type names that indicate
the role that these values play,
and it would open up a number of possibilities for
experimenting with other implementations
(a sketch of such an indirection layer appears after this list).
All of these changes can be done in a perfectly backward compatible way.
- Introduce some C++11 features, like "move constructors"
and "move assignment".
This would have to be done with compile-time flags to support
older compilers (a sketch appears after this list).
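
To illustrate the multicore point above, here is a minimal sketch of
the per-prime parallelization, using std::thread from C++11. The
function fft_mod_prime and the data layout are hypothetical
placeholders standing in for NTL's internal modular FFT, not actual
NTL interfaces; the placeholder body just reduces coefficients mod p
so that the sketch stays self-contained.

    #include <cstddef>
    #include <functional>
    #include <thread>
    #include <vector>

    // Placeholder standing in for the real modular FFT: it only
    // reduces each coefficient mod p, to keep the sketch runnable.
    void fft_mod_prime(const std::vector<long>& input,
                       std::vector<long>& output, long p)
    {
        output.resize(input.size());
        for (std::size_t j = 0; j < input.size(); j++)
            output[j] = ((input[j] % p) + p) % p;
    }

    // Each small-prime transform is independent of the others, so
    // they can run on separate cores with no synchronization other
    // than the final join. The reduce/CRT step could be split over
    // blocks of coefficients in the same style.
    void parallel_fft(const std::vector<long>& input,
                      std::vector< std::vector<long> >& outputs,
                      const std::vector<long>& small_primes)
    {
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < small_primes.size(); i++)
            workers.emplace_back(fft_mod_prime, std::cref(input),
                                 std::ref(outputs[i]), small_primes[i]);
        for (std::size_t i = 0; i < workers.size(); i++)
            workers[i].join();  // wait for all transforms to finish
    }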
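
To illustrate the "type indirection" point, here is a minimal sketch
of what such a layer might look like, with compile-time flags
selecting the representation. All names and flags below (sp_value,
sp_float, SP_USE_*) are hypothetical, chosen for this sketch only.

    // Hypothetical indirection layer for single-precision values.
    #if defined(SP_USE_INT30)
    // 30-bit values stored in int: opens up vectorized instructions
    // and GPUs, which typically lack 64-bit integer arithmetic.
    typedef int    sp_value;
    typedef double sp_float;
    #elif defined(SP_USE_LONG_DOUBLE)
    // LP64 with extended FP: long double's wider mantissa allows
    // 60-bit, rather than 50-bit, single precision values.
    typedef long        sp_value;
    typedef long double sp_float;
    #else
    // The current convention: long values, double approximations.
    typedef long   sp_value;
    typedef double sp_float;
    #endif

    // Floating-point-assisted MulMod written against the indirect
    // type names; pinv is a precomputed approximation of 1/p. For
    // simplicity this sketch computes products in long long, which
    // is exact only for values up to about 31 bits; the real
    // routines need subtler tricks for wider values.
    inline sp_value sp_MulMod(sp_value a, sp_value b,
                              sp_value p, sp_float pinv)
    {
        // q estimates floor(a*b/p); the loops fix any small error.
        long long q = (long long) (((sp_float) a) * ((sp_float) b) * pinv);
        long long r = (long long) a * b - q * p;
        while (r >= p) r -= p;
        while (r < 0)  r += p;
        return (sp_value) r;
    }

The point is simply that code written against sp_value and sp_float
compiles unchanged under any of the three configurations.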
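
And to illustrate the C++11 point, here is a sketch of how move
operations might be guarded by a compile-time flag so that the same
header still compiles with older compilers. The class and the
NTL_CXX11 flag are illustrative only, not actual NTL code.

    // Illustrative only: a simplified heap-backed class showing
    // move operations behind a compile-time flag.
    class BigBuf {
        long *rep;   // heap-allocated representation
        long len;

    public:
        BigBuf() : rep(0), len(0) { }

        BigBuf(const BigBuf& other) : rep(0), len(other.len) {
            if (len) {
                rep = new long[len];
                for (long i = 0; i < len; i++) rep[i] = other.rep[i];
            }
        }

        BigBuf& operator=(const BigBuf& other) {
            BigBuf tmp(other);  // copy-and-swap
            long *tp = rep; rep = tmp.rep; tmp.rep = tp;
            long tl = len; len = tmp.len; tmp.len = tl;
            return *this;
        }

        ~BigBuf() { delete[] rep; }

    #ifdef NTL_CXX11
        // Move constructor: steal the representation and leave the
        // source empty, avoiding the allocation and copy entirely.
        BigBuf(BigBuf&& other) noexcept : rep(other.rep), len(other.len) {
            other.rep = 0;
            other.len = 0;
        }

        // Move assignment: swap representations; the source's
        // destructor then frees our old buffer.
        BigBuf& operator=(BigBuf&& other) noexcept {
            long *tp = rep; rep = other.rep; other.rep = tp;
            long tl = len; len = other.len; other.len = tl;
            return *this;
        }
    #endif
    };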