YAXT: Issues
https://swprojects.dkrz.de/redmine/ | 2017-08-25T14:23:05Z | DKRZ projects
Feature #341 (New): Call MPI_Testany from time to time to improve performance
https://swprojects.dkrz.de/redmine/issues/341 | 2017-08-25T14:23:05Z | Moritz Hanke
<p>While doing some tests with YAC, I noticed that when setting up a list of MPI_Isends, performance improved if MPI_Testany was called from time to time (MPI_Testsome did produce better results).</p>
<p>We should check whether YAXT could also benefit from this.</p>
<p>The following pseudocode shows the idea, which is based on the routine psmile_bsend written by Hubert Ritzdorf for OASIS4.</p>
<pre>
int num_open_requests = 0
MPI_Request requests[total_num_send_recv_msgs]
for all send/recv msgs
  MPI_Request * request = &requests[num_open_requests++]
  set up MPI_Isend/MPI_Irecv
  int flag = 1, idx
  while (flag && (num_open_requests >= 64))
    MPI_Testany(num_open_requests, requests, &idx, &flag, MPI_STATUS_IGNORE)
    if (flag && (idx != MPI_UNDEFINED))
      requests[idx] = requests[--num_open_requests]
MPI_Waitall(num_open_requests, requests, MPI_STATUSES_IGNORE)
</pre>
Feature #340 (New): Improve message send order in Xt_exchanger
https://swprojects.dkrz.de/redmine/issues/340 | 2017-08-25T08:03:24Z | Moritz Hanke
<p>Under certain circumstances the message order produced by the routine xt_exchanger_internal_optimize might not be optimal.<br />Example:<br />A program with a total of 20 processes. Each of the first 10 processes sends a message to the last 10 processes.</p>
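<p>A minimal sketch of why the order can matter in this example. It rests on assumptions that are not stated in the issue: each sender sends one message to each of the 10 receivers, every sender posts one message per step, and without any correction all senders use the same receiver order. The function name and the rotation scheme are hypothetical and are not part of YAXT; receivers are numbered 0 to num_receivers - 1 here.</p>

```c
#include <assert.h>

enum { MAX_RECEIVERS = 64 };

/* Worst-case number of senders that target the same receiver in one step.
 * rotate == 0: every sender walks the receivers in the same order.
 * rotate != 0: sender s starts at receiver s % num_receivers and wraps
 *              around (hypothetical alternative ordering). */
int max_senders_per_receiver(int num_senders, int num_receivers, int rotate)
{
  assert(num_receivers > 0 && num_receivers <= MAX_RECEIVERS);
  int worst = 0;
  for (int step = 0; step < num_receivers; ++step) {
    int load[MAX_RECEIVERS] = { 0 };   /* messages arriving per receiver */
    for (int s = 0; s < num_senders; ++s) {
      int r = (step + (rotate ? s : 0)) % num_receivers;
      if (++load[r] > worst)
        worst = load[r];
    }
  }
  return worst;
}
```

<p>Under these assumptions, max_senders_per_receiver(10, 10, 0) reports that all 10 senders target the same receiver in every step, while the rotated order spreads them so that each receiver is addressed by one sender per step.</p>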
<p>We might have to think about another way of improving the message order.<br />An evolutionary algorithm could be a solution.</p>
Feature #338 (New): User-creatable index lists
https://swprojects.dkrz.de/redmine/issues/338 | 2016-02-05T16:23:47Z | Thomas Jahns (jahns@dkrz.de)
<p>Users should be able to create index list classes if they can describe their indices more succinctly this way. The following parts would be needed for this:</p>
<ul>
<li>Registering a pack tag and making sure it's identical on all ranks.</li>
<li>Registering an unpack function.</li>
<li>Optionally registering intersection function(s) for different (other) index lists and extending the table in xt_idxlist_intersection.c correspondingly.</li>
</ul>
<p>Anything I forgot?</p>
Feature #337 (New): check possibility of usage of mpi_type_create_hvector in xt_redist_collection...
https://swprojects.dkrz.de/redmine/issues/337 | 2015-04-27T12:34:37Z | Joerg Behrens (behrens@dkrz.de)
<p>If the sequence of redists given in the constructor always uses the same redist, then we should reduce the created datatype to a simpler form (using MPI_Type_create_hvector instead of MPI_Type_create_struct).</p>
Task #336 (New): Compare performance of YAXT and other coupling middleware
https://swprojects.dkrz.de/redmine/issues/336 | 2015-04-24T13:13:20Z | Thomas Jahns (jahns@dkrz.de)
<p>The following projects deliver partially similar functionality and should be investigated for relative performance:</p>
<ul>
<li>MCT</li>
<li>ESMF</li>
</ul>
Bug #335 (New): passing pointers of zero-size arrays to redists
https://swprojects.dkrz.de/redmine/issues/335 | 2015-04-23T09:45:59Z | Moritz Hanke
<p>The addresses of arrays passed to xt_redist_s_exchange can be NULL, when the size of the array is zero. This can cause problems especially when redist collections are used.<br />For zero-sized arrays a redist should not contain any message. When this happens we might have to consider applying some kind of special handling.<br />When the Fortran interface is used, no appropriate C_LOC-pointer can be generated.</p>
Tasks:
<ul>
<li>reproduce the problem with a test (only occurs with certain MPI implementations)</li>
<li>fix the problem or define that NULL pointers are not allowed</li>
</ul>
Feature #333 (Resolved): Caching of communicators
https://swprojects.dkrz.de/redmine/issues/333 | 2014-08-06T16:29:45Z | Thomas Jahns (jahns@dkrz.de)
<p>YAXT currently creates communicators internally to provide isolation from other parts of the system. This can be potentially costly when it introduces additional synchronization. For this reason it seems sensible to cache previously created communicators instead of destroying them immediately.</p>
<p>An alternative scheme requires managing tags in the library more closely and is potentially less resource intensive (depending on how costly a communicator is).</p>
Bug #331 (New): Make xt_xmap_distdir work when less data than expected gets packed
https://swprojects.dkrz.de/redmine/issues/331 | 2014-07-31T08:37:03Z | Thomas Jahns (jahns@dkrz.de)
<p>commit:daacfe17cb9124f4f0f3763858cc94ff666efb4a fixes a problem in xmap_all2all that is also present in the distributed directory variant: if the get_pack_size method of an index list returns a value that is larger than the actual advance of position that happens due to the pack method, distdir fails.</p>
<p>Steps to reproduce: simply add one to the count argument of MPI_Pack_size, e.g. in source:src/xt_idxvec.c#L369</p>
Feature #321 (New): runtime switch for the generation of MPI datatypes
https://swprojects.dkrz.de/redmine/issues/321 | 2013-06-27T13:32:47Z | Joerg Behrens (behrens@dkrz.de)
<p>There are now two ways to generate datatypes (src/xt_mpi.c):</p>
<p>(a) fast, but without exploiting potential for a compact description<br />(b) less fast, but with ...</p>
<p>So far the switch is a cpp symbol: COMPACT_DT in src/xt_mpi.c</p>
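<p>A minimal sketch of what a runtime switch could look like, assuming it is read from an environment variable. The variable name XT_COMPACT_DT and the function name are hypothetical and are not defined by YAXT; the choice of variant (a) as the default is also an assumption.</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name, not part of YAXT. */
#define XT_COMPACT_DT_ENV "XT_COMPACT_DT"

/* Returns 1 if the compact datatype generation (b) is requested and
 * 0 for the fast variant (a). Variant (a) is assumed as the default when
 * the variable is unset, empty, or set to "0". */
int use_compact_datatypes(void)
{
  const char *value = getenv(XT_COMPACT_DT_ENV);
  if (value == NULL || value[0] == '\0')
    return 0;
  return strcmp(value, "0") != 0;
}
```

<p>With such a helper, the code currently guarded by the COMPACT_DT symbol could branch on the returned value, so that both variants stay compiled in and can be compared from one binary.</p>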
<p>We need a runtime switch in order to test both versions without recompilation.</p>
Documentation #320 (New): Find method to document Fortran interface
https://swprojects.dkrz.de/redmine/issues/320 | 2013-06-04T11:26:22Z | Thomas Jahns (jahns@dkrz.de)
<p>Current doxygen versions have various weaknesses regarding Fortran; other Fortran-heavy projects like PIO or MCT might have something in store.</p>
<p>This needs further evaluation before large amounts of documentation are written.</p>
Feature #319 (New): support for two-phase-redistribution
https://swprojects.dkrz.de/redmine/issues/319 | 2013-04-16T12:27:30Z | Moritz Hanke
<p>Redistributions could also support a two-phase redistribution scheme that has two communication steps. This potentially allows for a reduction in the number of messages.</p>
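<p>To illustrate the possible reduction, the following sketch counts messages per rank for a dense exchange in which every rank has data for every other rank. The issue does not specify a routing scheme; the grid layout assumed here (ranks arranged as rows x cols, phase one within a row, phase two within a column) and both function names are hypothetical.</p>

```c
/* Messages sent per rank when every rank sends directly to every other. */
int direct_msgs_per_rank(int num_ranks)
{
  return num_ranks - 1;
}

/* Messages sent per rank in the assumed two-phase scheme: phase one
 * exchanges with the other ranks of the same row, phase two with the
 * other ranks of the same column. */
int two_phase_msgs_per_rank(int rows, int cols)
{
  return (cols - 1) + (rows - 1);
}
```

<p>For 64 ranks arranged as an 8 x 8 grid this gives 14 messages per rank instead of 63. The price in such a scheme is that data can be forwarded once, so the individual messages become larger.</p>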
Some thoughts on this:
<ul>
<li>probably only reasonable for in-place redistributions (see <a class="issue tracker-2 status-1 priority-4 priority-default" title="Feature: support for in-place redistributions (New)" href="https://swprojects.dkrz.de/redmine/issues/317">#317</a>)</li>
<li>the two phases could be detected in the xmap-generation or redist-generation step</li>
</ul>
Feature #317 (New): support for in-place redistributions
https://swprojects.dkrz.de/redmine/issues/317 | 2013-03-11T15:25:19Z | Thomas Jahns (jahns@dkrz.de)
<p>Redistributions should also work when input and output arrays are the same object. This needs the following improvements:</p>
<ul>
<li>Detect indices specified in redist construction that have same offsets in input and output on local task.</li>
<li>Remove these indices from the copy operation; this requires semantics for idxlist element removal (in sync with offset lists).</li>
<li>Detect in-place case from array arguments and use buffered send or something similar for communication to prevent overwriting of indices also sent to other processes.</li>
</ul>
Feature #310 (New): new xmap constructor: xt_xmap_dist_dir_dim_new
https://swprojects.dkrz.de/redmine/issues/310 | 2012-11-26T15:15:33Z | Joerg Behrens (behrens@dkrz.de)
<p>The implementation should use the routine xt_idxlist_get_bounding_box (see issue <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: new idxlist method: xt_idxlist_get_bounding_box (Closed)" href="https://swprojects.dkrz.de/redmine/issues/309">#309</a>) in order to efficiently preselect intersection candidates coming from the distributed directory bucket lattice for later intersection computation.</p>
<pre>
Xt_xmap xt_xmap_dist_dir_dim_new(Xt_idxlist src_idxlist, Xt_idxlist dst_idxlist,
                                 int ndim, Xt_int global_size[ndim],
                                 Xt_idx global_start_index, MPI_Comm comm);
</pre>
Feature #307 (New): idxsection_get_index_stripes_const
https://swprojects.dkrz.de/redmine/issues/307 | 2012-10-25T15:13:33Z | Joerg Behrens (behrens@dkrz.de)
<p>Implement a const version of idxsection_get_index_stripes (like get_indices_const).</p>
Feature #306 (New): garbage collector
https://swprojects.dkrz.de/redmine/issues/306 | 2012-10-25T15:11:10Z | Joerg Behrens (behrens@dkrz.de)
<p>Since we have an idxlist-internal cache that we might not need at some point, we might consider implementing a garbage collector to minimize the memory footprint of idxlists.</p>