Mechanical APDL Basic Analysis Guide


symmetric, complex, definite, and indefinite matrices. The QMR solver is more robust than the ICCG
solver.


5.3. Solver Memory and Performance


For best performance, understand the individual solvers' memory usage and performance under certain
conditions. Each solver uses different methods to obtain memory; understanding how memory is used
by each solver can help you to avoid problems (such as running out of memory during solution) and
maximize the problem size you can handle on your system.


5.3.1. Running Solvers Under Shared Memory


One of the easiest ways to improve solver performance is to run the solvers on a shared-memory
architecture, using multiple processors on a single machine. For detailed information about using the
shared-memory architecture, see Using Shared-Memory ANSYS in the Parallel Processing Guide.


The sparse solver has highly tuned computational kernels that are called in parallel for the expensive
matrix factorization. The PCG solver has several key computation steps running in parallel. For the PCG
and sparse solvers, there is typically little performance gain in using more than four processors for a
single job.


5.3.2. Using Large Memory Capabilities with the Sparse Solver


If you run on a 64-bit workstation or server with at least 8 GB of memory and you use the sparse solver,
you can take advantage of ANSYS' large memory capabilities. The biggest performance improvement
comes for sparse solver jobs that can use the additional memory to run in-core (meaning that the large
LN09 file produced by the sparse solver is kept in memory). Generally, you need 10 GB of memory per
million degrees of freedom to run in-core. Modal analyses that can run in-core using 6 to 8 GB of memory
(500K - 750K DOFs for 100 or more eigenmodes) show at least a 30 to 40 percent improvement in time
to solution over a 2 GB system.
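
The rule of thumb of 10 GB per million degrees of freedom can be turned into a quick sizing check.
The following Python sketch is illustrative only: the function names and the default ratio are
assumptions made for this example and are not part of any ANSYS interface. Actual in-core
requirements depend on the model and analysis type.

```python
# Rule-of-thumb sizing for sparse solver in-core runs.
# Assumption: roughly 10 GB of memory per million DOFs, as quoted in the text.
# The helper names below are hypothetical, chosen for this illustration.

GB_PER_MILLION_DOFS = 10.0


def estimate_incore_memory_gb(dofs, gb_per_million_dofs=GB_PER_MILLION_DOFS):
    """Return the approximate memory (in GB) needed to run in-core."""
    if dofs < 0:
        raise ValueError("dofs must be non-negative")
    return dofs / 1_000_000 * gb_per_million_dofs


def fits_incore(dofs, available_gb):
    """Return True if the rule-of-thumb estimate fits in available_gb."""
    return estimate_incore_memory_gb(dofs) <= available_gb


if __name__ == "__main__":
    # Print the estimate for a few model sizes.
    for dofs in (500_000, 750_000, 1_000_000):
        print(dofs, "DOFs ->", estimate_incore_memory_gb(dofs), "GB")
```

Under this estimate, a 750K DOF model needs about 7.5 GB, which is consistent with the 6 to 8 GB
range quoted above for modal analyses of that size.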


You can configure memory for sparse solver in-core runs explicitly using the BCSOPTION command,
but the easiest way to access this capability is to increase the initial memory allocation so that the
amount of memory available to the sparse solver exceeds the in-core memory requirement. The
performance improvement over a 32-bit system configured with nominal I/O performance can be even
more significant when the sparse solver memory requirement for optimal out-of-core operation is larger
than a 32-bit system can allocate. In such cases, I/O for the sparse solver factorization can increase
factorization time tenfold on 32-bit systems compared to larger memory systems that run either in
optimal out-of-core mode or in-core.


An important factor in big memory systems is system configuration. The best performance occurs when
processor/memory configurations maximize the memory per node. An 8-processor, 64 GB system is
much more powerful for large memory jobs than a 32-processor, 64 GB system. The program cannot
effectively use 32 processors for one job but can use 64 GB very effectively to increase the size of
models and reduce solution time. The best performance occurs when jobs run comfortably within a
given system configuration. For example, a sparse solver job that requires 7500 MB on a system with
8 GB does not run as well as the same job on a 12-16 GB system. Large memory systems use their
memory to hide I/O costs by keeping files resident in memory automatically, so even jobs too large to
run in-core benefit from large memory.
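
The two comparisons in this paragraph come down to simple arithmetic, sketched below in Python.
This is a minimal illustration, assuming 1 GB = 1024 MB and treating memory per processor as a
stand-in for memory per node; the helper names are hypothetical and not part of any ANSYS interface.

```python
# Compare memory per processor and memory headroom for candidate systems.
# Assumptions: 1 GB = 1024 MB; memory per processor approximates memory per node.
# All names here are hypothetical helpers for illustration.

MB_PER_GB = 1024


def memory_per_processor_gb(total_gb, processors):
    """Return the memory (in GB) available per processor."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return total_gb / processors


def memory_utilization(job_mb, system_gb):
    """Return the fraction of system memory that the job would occupy."""
    if system_gb <= 0:
        raise ValueError("system_gb must be positive")
    return job_mb / (system_gb * MB_PER_GB)


if __name__ == "__main__":
    # 8-processor versus 32-processor system, both with 64 GB.
    print(memory_per_processor_gb(64, 8), memory_per_processor_gb(64, 32))
    # The 7500 MB job on an 8 GB system versus a 12 GB system.
    print(memory_utilization(7500, 8), memory_utilization(7500, 12))
```

On these assumptions, the 8-processor system offers 8 GB per processor against 2 GB for the
32-processor system, and the 7500 MB job occupies more than 90 percent of an 8 GB machine, leaving
little memory for the operating system to cache files.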


All ANSYS, Inc. software supports large memory usage. It is recommended for very large memory machines
where you can run a large sparse solver job in-core (such as large modal analysis jobs) for the greatest
speed and efficiency. To use this option:


Release 15.0 - © SAS IP, Inc. All rights reserved. - Contains proprietary and confidential information
