Differences

This shows you the differences between two versions of the page.

--- resarch:nlpa:preetha [2016/08/31 10:07]
pollock
+++ resarch:nlpa:preetha [2017/02/16 14:27] (current)
preethac
@@ Line 1: / Line 1: @@
 ====== Preetha's Page ======
-====== Notes from Fall 2016 Meetings ======
+**
+Spring 2017:**
-Work on code example mining from research articles:
-To use for mining code segment comments/descriptions:
-Could pull out sentences related to the code segments in the articles.
-Look for sentences whose subject is a method name, by performing chunking to get the subject phrases and verb phrases (just partial parsing). Could use the Stanford parser because these are regular text sentences, not code.
-Change the dictionary creation from code segments to also estimate the role of each name in the code. Are they method names, variable names, etc.?
-Look at the verbs that are in present or future tense. These are probably describing the code above.
-To use for all buggy code examples and the kinds of bugs:
-Need to figure out which sentences tell you the bug.
-Work on identifying definitions from research articles:
-From Vijay's bio lit work:
-  * googlism - who, what, where, when - is a relation
-  * find what is most common
-  * from genes, looked for 'is a' in bio literature
-  * Look at Marti Hearst - 1992 first work on extracting definitions, etc
-  * Today, people are trying word embeddings to get relations and comparing the results to her approach
-Contributions:  (A Miner of Definitions from Research Articles)
-  * Apply this existing tool for "is a" to research articles in a subfield to find terms and their definitions, and where each term is first defined.
-  * Potential users - dictionary for software engineering research and NL tools
-  * Can find tools and what they are used for
-Approach:
+[[http://hiper.cis.udel.edu/udsacl/doku.php/research/nlpa/preethadissertation|Dissertation]]
-  * googlism approach - generalize beyond looking for 'is a'
-  * Read Marti Hearst's paper to get ideas
-  * identify a set of key phrases, positives, etc.
-  * Samir Gupta has code to do this. Apply his code to a set of ICSE papers and see what you get.
-  * "such as", .. "including a, b, and c" tells me about a, b, and c
-  * Need to show it can be done and it is scalable to millions
-  * Goal: make it scalable, without processing every sentence of every paper.
-  * Start with 'is a' and look for those sentences. Where do they come from?
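The cue phrases named in the removed notes above ('is a', "such as", "including a, b, and c") can be illustrated with a minimal Python sketch. It is not Samir Gupta's code or Marti Hearst's method. The regular expressions, the function names `split_items` and `extract_relations`, and the `is_a` label are all assumptions made for illustration, and a real miner would likely use chunking to find phrase boundaries rather than word-level patterns.

```python
import re

# Hypothetical, deliberately simple cue-phrase patterns (assumptions, not an
# existing tool): "<term> is a <category>" at the start of a sentence, and
# "<category> such as|including <a, b, and c>" anywhere in it.
IS_A = re.compile(r"\s*(?:An? |The )?([\w-]+(?: [\w-]+)*?) is an? ([^.,;]+)")
LIST_CUE = re.compile(r"([\w-]+),? (?:such as|including) ([^.;]+)")


def split_items(text):
    """Split a coordinated list like 'a, b, and c' into ['a', 'b', 'c']."""
    parts = re.split(r",\s*(?:and |or )?|\s+(?:and|or)\s+", text)
    return [p.strip() for p in parts if p.strip()]


def extract_relations(sentence):
    """Return (term, 'is_a', category) triples found in one sentence."""
    triples = []
    # Pattern 1: the sentence opens with "<term> is a/an <category>".
    m = IS_A.match(sentence)
    if m:
        triples.append((m.group(1), "is_a", m.group(2).strip()))
    # Pattern 2: each listed item is treated as an instance of the word
    # immediately before the cue phrase.
    for category, items in LIST_CUE.findall(sentence):
        for item in split_items(items):
            triples.append((item, "is_a", category))
    return triples


if __name__ == "__main__":
    for s in [
        "Chunking is a partial parsing technique.",
        "We mined artifacts including bug reports, commit messages, and forum posts.",
    ]:
        print(extract_relations(s))
```

Because each sentence is checked only against a few cheap patterns, a filter like this could be used to select candidate sentences first, so that heavier parsing is not run on every sentence of every paper.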
 **
-Spring/Summer 2016:**
+Spring/Summer/Fall 2016:**
 Identifying/Characterizing Facts and Advice From Mixed Text-Code Artifacts: