(Q):
Can the test dataset contain new labels for any of the poly, point, or line feature types? That is, labels which appear in neither training nor validation.
(A):
The style and types of features in the training and validation sets are representative of the evaluation set. New labels will be provided for the evaluation maps.
(Q):
What is a _pt_poly.tif type? Are these two separate labels or just one?
(A):
The final character string after the last underscore indicates the feature type (_pt, _line, or _poly).
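A minimal sketch of applying that naming convention in code, assuming the type is always the segment between the last underscore and the extension (the filenames below are hypothetical examples, not actual challenge files):

```python
# Derive the feature type (pt, line, or poly) from a label filename,
# assuming the convention above: type = segment after the last underscore.
def feature_type(filename):
    stem = filename.rsplit(".", 1)[0]   # drop the .tif extension
    return stem.rsplit("_", 1)[-1]      # segment after the last underscore

print(feature_type("CO_example_thrust_fault_line.tif"))  # → line
print(feature_type("AK_example_cross_pt.tif"))           # → pt
print(feature_type("map_Qal_pt_poly.tif"))               # → poly (one label, of type poly)
```

Note that `..._pt_poly.tif` is therefore a single poly label whose feature name happens to end in `pt`, not two labels.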
(Q):
For the point prediction output, should the output value be 1 or 255?
(A):
For any feature, the output should be a binary raster (black and white image), where only the pixels that contain the feature being extracted are encoded as “1” for feature present and all other pixels are encoded as “0” for feature absent. Match the format provided in the training data, which is a single-band 8-bit unsigned .tif with values of 0 or 1.
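A minimal sketch of producing a mask in that format, assuming your model emits per-pixel scores in [0, 1] (the `scores` array and the 0.5 threshold are illustrative, not part of the challenge specification):

```python
import numpy as np

# Hypothetical per-pixel prediction scores in [0, 1].
scores = np.array([[0.9, 0.2],
                   [0.4, 0.7]])

# Threshold to a single-band 8-bit unsigned mask with values 0 or 1 (not 255).
mask = (scores >= 0.5).astype(np.uint8)

print(mask.dtype)            # → uint8
print(mask.tolist())         # → [[1, 0], [0, 1]]
# Writing the .tif itself could be done with, e.g., Pillow:
#   Image.fromarray(mask).save("prediction_pt.tif")
```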
(Q):
Maps ending in "mosaic" have 5 point features for detection -- can we assume that the same 5 features will always be in the same order (i.e., _pt1 = "cross shape", _pt2 = "box shape", etc.)?
(A):
All the training maps ending in mosaic have the same legend and the five features are labeled similarly. You should not rely on this relationship holding true across other maps, with or without “mosaic” in the filename.
(Q):
Can you explain the binary rasters for the overlapping features? It was explained in the Map_feature_Extraction_Challange_Details.pdf, but I am still not sure I understand it correctly.
(A):
In general, the solution to this challenge attempts to mimic what a human would do when digitizing the map. For example, when a bedrock polygon is obscured by a water feature, such as a lake or reservoir, the bedrock feature is assumed to be continuous beneath the water. In this case, the color of the water (blue) may not match the legend color for the bedrock polygon. Similarly, a thin surficial deposit of stream sediment (e.g., Qal) may obscure a bedrock polygon.
(Q):
Will submissions be inspected for differences in format, or for debugging of silly mistakes?
(A):
No feedback on formatting or debugging of submissions will be provided. Please use the validation rounds to discover mistakes and to verify that your output will be evaluated correctly.
(Q):
The legends that will be in the validation and test, are they all included in the training set?
(A):
No. Each map should be thought of as containing a unique set of features that pertain only to that map. A feature identified in the legend of one map may not correspond to the same feature on another map, so color matching should only be done within a single map.
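An illustrative sketch (not the official baseline) of per-map color matching: each pixel is assigned to the nearest legend swatch color, using only the legend of that same map. The legend entries and pixel values below are made-up examples.

```python
import numpy as np

def nearest_legend_label(pixels, legend_colors):
    """Assign each RGB pixel to the nearest legend color of THIS map.

    pixels: (N, 3) array of RGB values.
    legend_colors: dict mapping label -> (r, g, b) swatch color.
    """
    labels = list(legend_colors)
    colors = np.array([legend_colors[l] for l in labels], dtype=float)  # (K, 3)
    # Squared Euclidean distance in RGB space, pixels (N,1,3) vs swatches (1,K,3).
    d = ((pixels[:, None, :].astype(float) - colors[None, :, :]) ** 2).sum(-1)
    return [labels[i] for i in d.argmin(axis=1)]

# Hypothetical legend of one map; a different map would get its own dict.
legend = {"Qal": (230, 230, 150), "water": (120, 180, 230)}
px = np.array([[228, 229, 148], [118, 182, 231]], dtype=np.uint8)
print(nearest_legend_label(px, legend))  # → ['Qal', 'water']
```

Because the lookup dict is built per map, the same color can legitimately mean different features on different maps.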
(Q):
Is there a reference for how it is usually done?
(A):
I am not aware of an existing solution to this problem aside from the baseline provided. I encourage you to look at the literature or at approaches used in other fields, such as medical imaging or remote sensing. There might be existing models that could be productive here.
(Q):
Are there universal symbols?
(A):
No, but there are similarities. For the most part, the maps are USGS products. The USGS tries to use consistent symbology across products, but over 100+ years, there has been evolution in that symbology. Some look the same, but there are exceptions.
(Q):
Do you have to use open CV?
(A):
No, you are not required to use any specific software package or library. The validation script uses OpenCV, however, so you will need OpenCV installed to run that script yourself. If you prefer, you can create a separate environment for the validation script so it does not interfere with your own.