This is the informal research journal Padmini kept to log goals
and accomplishments on a (nearly) day-to-day basis. She found the journal
very helpful for staying on task and communicating with her
mentor. Padmini would also like to note that on most days she updated
entries in the morning, so there may be a one-day lag in reporting
completed to-do items.
Please keep in mind that this is an informal research journal.
The Odd Note:
Sudarshan is the graduate student Padmini worked with, and the files
folder mentioned in some of the logs refers to an older version of this site
where Padmini posted all relevant files (screenshots, code, documentation,
data, etc.) so that her mentor and graduate student could stay informed of her
work.
Week 1 || June 28 - July 2
Weekly Goals:
Learn VHDL; install software (Xilinx and Synplicity) on the machine
used; write code for adder as an example; write testbench, and simulate
adder; synthesize adder.
6/30/04
To Do:
1. Finish synthesizing adder.
2. Figure out how to change physical configuration of adder.
3. Find out timing delay of adder.
4. Install the new version of EDK. Sudarshan says the trial version has
the tools we need.
5. Update journal.
Done:
1. Got a good idea of VHDL. VHDL seems so archaic, a lot of little rules.
2. Simulate adder. It took me a couple hours because I was debugging the
VHDL code.
3. Physical configurations come later.
4. Trial version not needed right now, no new installs of EDK.
5. Journal updated at least several times a week, often two or three
times a day.
7/1/04
To Do:
1. Understand how to read input vectors from file.
2. Understand how to parameterize entities using the generic parameter.
Done:
1. Reading from a file + read a vector + read a bit out of a vector.
2. Parameterize entities - generics.
Week 2 || July 5 - July 9
Weekly Goals:
Create, simulate, synthesize, modify physical configuration of
multiplier; Learn how to constrain shape and timing in simulation and
synthesis;
7/5/04
To Do:
1. Go thru all of Sudarshan's code, retest and generally understand
state machine and testbench.
2. Synthesize with .sdc (force logic, uses LUT's) and without .sdc
(embedded), check timings.
3. Look at floorplanner to get an idea of the default layout for each of
the above.
4. Introduce .ucf, constraint file, to force LUT's into particular
columns. Use floorplanner to double check. This step might not work,
might need time to figure out how to get it to work.
Done:
1. Tested all of code and understood state machine. State machine is
pretty general. Can change testbench to allow for different input sizes
and change the number of inputs to the matrix multiplier.
7/7/04
[In this folder] I've posted any interesting code and screenshots
from Floorplanner.
To Do:
1. Either consolidate all the screenshots into coherent bigger pictures
or order them better.
Done:
1. Synthesized multiplier with and without embedded multiplier.
Curiously the simulation times for multiplications were the same.
2. Done simulating, synthesizing and looking at floor planner for the
multiplier using embedded multiplier and forced logic. In the process of
figuring out how to constrain files.
[Add-on] To the left is the screenshot of the multiplier without the
embedded multiplier. The right is with. It's hard to see the components
because they occupy so few CLBs. The one that uses the embedded
multiplier is clustered around them, as embedded multipliers are
clustered in the middle and at the ends of the chip. The multiplier that
doesn't use the embedded multipliers is away from them and uses strings
of CLBs for multiplication, as can be seen in the middle of the
screenshot at the left.

7/8/04
To Do:
1. Re-simulate all the constraints and look at post-route times.
2. Get ready for demo - code, simulation, synthesis, screenshots of
floorplanner, discussion.
Done:
1. Getting the constraint files to work was surprisingly easy with the
PACE program. PACE is launched from project navigator, just have to
double click on the .ucf file to launch. Originally I'd planned to play
with the .ucf file manually, but decided I'd rather graphically
manipulate the constraints and then look at the generated .ucf file. The
generated files are not that different from the original file, and it's
useful to see how it's changed. I've put up different copies of the .ucf
files for the different constraints in my research folder (link above).
2. Simulated, synthesized and floorplanned 2, 4, 6 and unconstrained CLB
column constraints for LUT multiplier and matrix storage. Found no
difference in multiplication times, which is odd considering the routing
is different for each. Sudarshan brought up a good point today, I was
only looking at the simulation after post-map, not post-route, so today
I'm going to re-simulate everything and look at the post-route times. I
suspect this will account for the routing delays.
7/9/04
To Do:
1. Figure out why post place and route simulation won't run properly and
get it to run.
Done:
1. The way the screenshots are organized is fine. Didn't want to waste
more time to make them marginally prettier.
2. Though it was exciting to get the simulation results above, they
turned out to be wrong. We were only looking at the post-map
simulations; our biggest clue was that all the times were the same, even
though the routing was different. Looking at the post-place and route
simulation, there were a lot of latch and setup timing violations.
Apparently by the time the states in the state machine are decoded,
there's very little clock cycle left for the actual computation to take
place. Hence the setup violations. We spent all day yesterday trying to
figure out what was wrong. Changed the state machine so that it only had
a single dimension matrix instead of a two dimension matrix, still had
timing violations. Increased clock time to no avail. We didn't test the
circuit at a lower frequency, so that's what I'm going to do right now.
Hopefully we'll progress towards solving the problem by the end of
today.
3. Demo didn't happen because of the many errors we had with the current
code. Perhaps next week, once everything's working and the site is
updated.
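A rough sketch of the setup check described in item 2 above, in Python. The function names and every delay value are hypothetical, picked only to illustrate the idea that state decoding eats into the clock cycle before the computation starts; none of them come from the actual place and route reports. The model covers only that decode-then-compute explanation and says nothing about the latch violations.

```python
def setup_slack_ns(clock_period_ns, decode_delay_ns, compute_delay_ns, setup_time_ns):
    """Time left over in one clock cycle; a negative value means a setup violation."""
    used_ns = decode_delay_ns + compute_delay_ns + setup_time_ns
    return clock_period_ns - used_ns


def has_setup_violation(clock_period_ns, decode_delay_ns, compute_delay_ns, setup_time_ns):
    """True when the data arrives too late for the register's setup window."""
    return setup_slack_ns(
        clock_period_ns, decode_delay_ns, compute_delay_ns, setup_time_ns
    ) < 0


# Hypothetical delays (assumptions, not measured values).
DECODE_NS = 6.0   # time spent decoding the state machine's current state
COMPUTE_NS = 5.0  # time the multiplier logic needs after decoding
SETUP_NS = 1.0    # setup time of the destination register

for period_ns in (10.0, 20.0):
    slack = setup_slack_ns(period_ns, DECODE_NS, COMPUTE_NS, SETUP_NS)
    print(f"period {period_ns} ns -> slack {slack} ns, "
          f"violation: {has_setup_violation(period_ns, DECODE_NS, COMPUTE_NS, SETUP_NS)}")
```

In this simplified model a longer clock period (a lower frequency) is what removes the violation, which is the experiment planned at the end of item 2.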
7/10/04
To Do:
1. Re-simulate all constraints with new vhdl files.
2. Put up new screenshots of constraints.
3. Reconfigure FPGA.
4. Repeat with different shapes and delays.
Done:
1. Re-simulated all constraints with new vhdl files. There are no errors
at this point. Just have to update the data to the webpage.
2. Reconfiguring FPGA will come much later when we have more complex
circuits to work with. At this point the multiplier is simple enough
that the place and route simulation gives results that are significantly
similar to the real timings on the board. As to different shapes and
delays, we are doing that with the different constraints files.
Week 3 || July 12 - July 16
Weekly Goals:
Finish off last week's stuff; plan out a matrix multiplier module;
write a description of matrix multiplier so that Eli, Sudarshan and I
are on the same page; secure other machine for use.
7/12/04
To Do:
1. Run floor planner and put up new screenshots.
2. Update code and data.
3. Start working on variations of multiplication, 2/3/4 multiplications.
For each different mult file, do at least four or five constraints.
4. Update site with code and screenshots.
Done:
1. Spent all morning updating website with new floorplanner screenshots.
Set up a new naming convention for them. There's a readme file inside
the research folder that explains anything that needs explaining about
the contents of the folder. As I add or change the folder, I will update
the readme.
7/14/04
To Do:
1. Write brief description of what I'm going to do in the next couple of
days so that Eli, Sudarshan and I are all on the same page.
2. After Eli and Sudarshan talked, it was decided that running
simulations and synthesis was time consuming, with results that were
probably more accurate than we need. So instead I will be using CORE
Generator to generate modules that I will then test, constrain and
synthesize. The synthesis tools provide data sheets on how long the
produced modules take and also the maximum delay time. This
approximation of the upper limit on route and run time is more efficient
than taking precise measurements. Test, synthesize a single multiplier
(8bit x 8bit) module.
Done:
1. Spent all day yesterday simulating, mighty slow work waiting for them
to run. It's a combination of super slow machine + running programs off
of my home directory, which is stored on some Unix server somewhere.
Anyway, had to redo the simulations I finished on Monday because I'd
forgotten the .sdc file, the file that constrains the synthesis tools to
only use CLB's instead of embedded multipliers.
7/15/04
General Update: Nothing much is happening right now. Having computer
difficulties, current computer has very little memory and hard drive
space, making it really time consuming to run simulations while most of
the files are stored on a unix server elsewhere. We have secured a
computer with better specs, but are having problems accessing it.
Hopefully it will be resolved soon. Meanwhile I'm working on the
description for Eli and Sudarshan.
Week 4 || July 19 - July 23
Weekly Goals:
Update description after nixing the old implementation of matrix
multiplier; learn how to use COREGenerator to create insta-modules of
applications that we'd like to use.
7/19/04
To Do:
1. Get other computer to work.
2. Rewrite description.
Done:
1. Installing windows updates in hopes that the other computer will
cooperate.
2. After much thought it was decided that fully pipelining the matrix
multiplier would be inefficient. So back to my original plan, put the
multipliers in parallel and register the stages. Had to start all over
again on the description, which is why it's taking so long.
7/20/04
To Do:
1. Finish up description.
2. Start writing code.
3. Try to fix other computer.
7/21/04
To Do:
1. Test to see how many cycles a COREgen multiplier actually takes.
2. Modify design if needed based on multiplier cycles.
3. Start writing code, using generate and generic.
4. Find out when Janet can come here to look at computer.
Done:
1. [Description posted] in files folder.
7/22/04
Hit a speed bump in terms of coding. I couldn't figure out how to
get the post map simulate to work. The first two simulations ran fine,
but the last two died with errors. Spent three hours yesterday browsing
help manuals and tutorials. I remember from my digital design class that
I have to add the path to XilinxCoreLib somewhere, but I'm not sure
where. Right now I'm going back to my class notes, hopefully that'll
show something.
Done:
1. Glad to get the core gen stuff working again. I ran the post-route
and placement simulations for just one multiplier and the waveforms
showed that the multiplier took a fraction of a clock cycle to compute.
So technically, input goes in parallel, result comes out parallel in the
next cycle. Earlier when looking at the data sheets for a parallel
multiplier, it looked like each bit of an input would go in serially.
Maybe the simulation itself can't show each bit going in per clock
cycle, maybe I'd have to run it physically on the board to double check.
But in the meantime, hopefully there won't be any multiplier related
timing issues with the testbench. I'm almost done with that. Then on to
some serious testing.
7/23/04
To Do:
1. Finish testing and debugging code.
2. Start constrain tests.
3. Devise a graphical representation for data collected.
4. Create power point presentation: Purpose, methods, examples, data.
Done:
1. Finished writing modules and testbench. Have a funky error in
ModelSim where it just hangs on a process. I can't break the process,
ModelSim doesn't show as not responding in the task menu, and it doesn't
die right away when I kill the process either. There was one thread that
was taking a lot of memory, vish.exe. I researched the process and
found a document, Agilent Technologies Advanced Design System 2001:
Release Notes that had a page on vish.exe. Interestingly it said,
"This document describes known defects in Advanced Design System 2001
and, wherever possible, provides workarounds. It also identifies errors
and omissions in the ADS 2001 documentation." So here's the page:
HDL Cosimulation
Real ports must be initialized for some ModelSim versions for VHDL entities
In a VHDL entity, the real ports must be initialized for
certain versions of ModelSim SE (such as version 5.4d) or the ModelSim
will fail with an out-of-range error.
Stray processes need to be terminated manually
- On Windows platforms, if an HDL cosimulation is interrupted or it
errors out, stray processes may continue (and tie up the license).
- On all platforms, if any other component (non-HdlCosim component)
causes a runtime error while simulating a design containing an HdlCosim
component and the ADS simulation stops, the HDL simulator may still be
running (and tying up the license).
The workaround for both of these problems is to terminate the stray
processes manually. The processes that may need to be terminated are
vlm.exe, vsim.exe, vish.exe (for ModelSim), and verilog.exe (for
VerilogXL). You may also have to terminate hpeesofsim.exe.
Custom HdlCosim components no longer supported
Custom HdlCosim components created using the ADS model builder are
no longer supported.
Parameter values of HdlCosim components may require updating
Because of changes made to the HdlCosim component parameters in ADS
2001, parameter values of HdlCosim components in your ADS 1.5 designs
may need to be re-entered.
Week 5 || July 26 - July 30
Weekly Goals:
Create a ppt presentation for half way mark; constrain matrix
multiplier module at different columns and find performance times.
7/26/04
Done:
1. Done updating ppt presentation, saved in [files folder].
7/27/04
To Do:
1. Constrain matrix multiplier.
2. Update web files.
Done:
1. It's brilliant! The floorplanner view of the matrix multiplier is
beautiful, the different colors arranged in different parts of the
board, eight bright colors, one for each multiplier arranged in a
circular fashion with the striped, color pattern of the rest of the
adders in the center. Woohoo!
:Ahem: On to other news, the multiplier is done, when I finish running
all the tests, I will update the site, so you too can appreciate the
beautiful colors. :grin:
[Add-on] The left floorplanner screenshot is of just the components of the matrix
multiplier without constraint. The one on the right is with all the
internal wires. It gives an idea of how intricate the wiring can get.

7/28/04
Just when you think you're done...
To Do:
1. Change all supposed CLB column numbers to x/2. Sudarshan mentioned
that 18 columns of CLBs seemed excessive for the matrix multiplier, and
he was right. After looking closely at the floorplanner board, we
realized that what we had been counting as 1 CLB column was really 1
slice column. So the number of CLB columns is really half of what we
thought it was. Have to go thru all my data/notes/graphs/files to change
the number of CLBs to the correct number. But for now, I'm more worried
about the presentation. Which brings me to my next item...
2. Update slides with the right data. Sudarshan mentioned that he was
more interested in even columns of CLBs. Half of the CLBs that I thought
were even are really odd, so I have to redo half of the synthesii (the
correct form of plural synthesis I hope). I think I have just enough
time to redo the multiplier and the matrix multiplier. What really
complicates things is that for each test, I do three things: change
constraint files, save the place and route report and take floorplanner
screenshots. The floor planner screen shots are the most extensive since
they are broken down into three parts as well: full board views with and
without wires, and module zooms with and without wires. Of course all of
this is more complicated by the fact that the matrix multiplier is only
judged by the place and route reports, while the multiplier requires
exact simulation to get nanosecond to nanosecond timing. Now I'm at a
crossroads: should I scrap the data I had before for the multiplier and
only look at the place and route reports to get delays, or should I do
full simulations for each one? Each option averages out to be the same
amount of time. So in the name of uniformity, I'll just redo all of the
multiplier and only look at the place and route reports.
3. Talk with Eli and Sudarshan about the different criteria when looking
at the synthesis.
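The x/2 correction in item 1 can be sketched in Python as below. The helper names and the list of recorded values are hypothetical (only the 18 comes from the entry above); the one fact taken from the journal is that two slice columns make up one CLB column. The sketch also shows why half of the supposedly even CLB counts turn out odd: a recorded value stays even after halving only if it is a multiple of four.

```python
# From the journal: what was recorded as "1 CLB column" was really 1 slice column,
# and there are two slice columns per CLB column.
SLICE_COLUMNS_PER_CLB_COLUMN = 2


def to_clb_columns(recorded_columns):
    """Convert a recorded (slice) column count to the true CLB column count."""
    return recorded_columns / SLICE_COLUMNS_PER_CLB_COLUMN


def is_even_clb_count(recorded_columns):
    """True when the corrected value is a whole, even number of CLB columns."""
    return recorded_columns % (2 * SLICE_COLUMNS_PER_CLB_COLUMN) == 0


# Hypothetical recorded values, apart from the 18 mentioned in the entry.
recorded = [2, 4, 6, 8, 18]

for columns in recorded:
    print(f"recorded {columns} -> {to_clb_columns(columns)} CLB columns, "
          f"even: {is_even_clb_count(columns)}")
```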
Week 6 || August 2 - August 6
Weekly Goals:
Create new application modules starting with FFT; figure out COREGen for
sure, couldn't get it to work last time; synthesize more modules; figure
how many more modules I can synthesize.
8/4/04
To Do:
1. Research FFT.
2. Write code for FFT.
3. Simulate and debug modules.
4. Synthesize with different configurations.
Done:
1. [Mid-summer report] was straight forward, all answers were text based
and brief. I am preparing for the final report by adding to a much more
elaborate file on the side. The final report will be much more expanded
with a full project description that contains all the details including
results and graphics.
8/5/04
It turns out that there is a smorgasbord of free code out there that
I can just use for most of the applications. This cuts down time spent
writing code. Right now I'm finding code for the FFT, and I have to make
sure the physical constraints are correct for it.
8/6/04
To Do:
1. Go thru multiply accumulator and direct digital analyzer
applications.
2. Synthesize six new applications, preferably big applications.
3. Write up the results.
Done:
1. Synthesized four applications yesterday: FFT, MAC FIR filter, 1-D
Discrete Cosine Transform and Sine/Cosine Lookup Table. Woohoo! It
really helped that I didn't have to write testbenches for them. I spent
a couple hours trying to figure out inputs for the FFT, what a waste of
time!, until Eli said forget about it. That sure took the pressure off.
So now I'm just focused on synthesizing the application modules.
Sudarshan needs single data points from many different applications. So
instead of synthesizing each application with different constraints, I'm
focusing on going thru as many as possible. In a week, or maybe a couple
days, I will go back to each application and carefully synthesize them
again several times.
Week 7 || August 9 - August 13
Weekly Goals:
Create testbenches for applications to find out computational clock
cycles; Sudarshan needs that data.
8/9/04
To Do:
1. Find out how many cycles an application takes to run. At this
point, none of us really knows how to find the exact number of
computation cycles without fully simulating the module. I want to avoid
doing that, so I'll have to research the data sheets and the Xilinx help
pages to figure out approximate computation time.
2. General goal: frequency of module vs columns. Main goal after
generating data for Sudarshan is to generate a range of data points to
find relationship between frequency and physical size of application.
For the next three weeks I will find this relationship for many
different applications. So by the end of my project, I will have data
for a dozen or so applications (hopefully) that will allow the user to
get an idea of the tradeoff between frequency and physical size.
Done:
1. Finished synthesizing eight new applications with time and column
constraints. Have a nice big table with data that I will upload. But
it's not the end point. Sudarshan is mainly looking at number of columns
vs number of cycles. So I will be researching the number of cycles.
8/10/04
Update: Research on computation time continues. I've uploaded the
data sheets for the applications in the [data_sheets folder]. They're
all pdfs. I've also uploaded two files, "Modules Descriptions" and
"Modules Data (single point)" in the [word_xcell_ppt_files]. The first
file contains descriptions and implementation options for all the
applications. The second file contains specific information about the
apps such as number of slices, columns, approximate max frequency, etc.
As I find the computation time I will update that to the files.
8/11/04
Update: Same old, same old.
8/12/04
Update: Yesterday in my rush to make sure all files from the local
hard drive were updated to my unix directory, I overwrote the latest
timing files. Fortunately I wrote down the period, delays, etc., so I
don't have to do any deduction to get to the period.
Having trouble with ModelSim; the license doesn't work because the
registration is for a different computer with a different hard drive
serial number.
8/13/04
Update: More simulation and testing of applications. Sometimes it
can be time consuming because I have to adjust the timing of the input
signals in the testbench to get the module to work right. This involves
referencing the data sheet, the coreGen implementations, and going back
and forth between code and waveform windows.
Week 8 || August 16 - August 20
Weekly Goals:
More applications to synthesize to meet minimum columns (at least 10);
find computation times; although this data is used as part of my report
it is also being used in Sudarshan's paper which he is submitting for
publication.
8/17/04
To Do:
1. Find more applications that take up more than 2240 slices (112
slices/column * 20 columns). Sudarshan mentioned that he wanted
applications that take up more than 10 CLB columns, i.e. 20 slice columns.
2. Find maximum frequency, delays, computation time of these
applications.
3. Continue with my project. For each application find five points of
data. Constrained at +0, +4, +8, +12, +16 and unconstrained, find the
frequency for each. Use this to create a graph of module
characteristics.
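The slice arithmetic in item 1 can be written out as a small Python sketch. The 112 slices per slice column and the two slice columns per CLB column are taken from the journal; the function names and the 1000-slice example module are hypothetical. The second helper assumes a module fills whole columns densely, which the placement tools are under no obligation to do, so treat its result as a lower bound.

```python
# Figures taken from the journal entries.
SLICES_PER_SLICE_COLUMN = 112
SLICE_COLUMNS_PER_CLB_COLUMN = 2

SLICES_PER_CLB_COLUMN = SLICES_PER_SLICE_COLUMN * SLICE_COLUMNS_PER_CLB_COLUMN


def slices_in_clb_columns(clb_columns):
    """Number of slices contained in the given number of full CLB columns."""
    return SLICES_PER_CLB_COLUMN * clb_columns


def min_clb_columns_for(slices):
    """Fewest CLB columns that could hold this many slices (ceiling division).

    Assumes perfectly dense packing, so real placements may need more columns.
    """
    return -(-slices // SLICES_PER_CLB_COLUMN)


print(slices_in_clb_columns(10))   # threshold for "more than 10 CLB columns"
print(slices_in_clb_columns(4))    # what a 4 CLB column threshold would mean
print(min_clb_columns_for(1000))   # hypothetical 1000-slice module
```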
Done:
1. Computational time of three biggest applications is as follows: FFT -
6213 cycles, 2-D Discrete Cosine Transform - 170 cycles, and CORDIC - 21
cycles. This is not enough data, I was hoping to get in at least five or
six decent sized applications, but they are hard to find. Of the
original eight I synthesized, only two were bigger than 10 CLB columns.
2. Out of the 16 different applications I've tried so far, only 4 of
these meet the size requirements of Sudarshan, more than 10 CLB columns:
FFT, 2-D Discrete Cosine Transform, 1024-PT Complex Transform, and
256-Pt Complex Transform. I've emailed Sudarshan to ask if he has more new
applications for me to try. Meanwhile I will return to my original
project and continue to find the relationship between physical area size
and frequency.
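To turn the cycle counts in item 1 into wall-clock computation time, they have to be combined with a clock period. A minimal Python sketch follows; the cycle counts are the ones listed above, while the 10 ns clock period and the function name are assumptions for illustration only, not figures from the place and route reports.

```python
# Cycle counts reported in the journal entry above.
CYCLES = {
    "FFT": 6213,
    "2-D Discrete Cosine Transform": 170,
    "CORDIC": 21,
}

# Hypothetical clock period in nanoseconds (an assumption, not a measured value).
ASSUMED_PERIOD_NS = 10


def computation_time_ns(cycles, period_ns):
    """Computation time is the number of clock cycles times the clock period."""
    return cycles * period_ns


for name, cycles in CYCLES.items():
    print(f"{name}: {cycles} cycles -> "
          f"{computation_time_ns(cycles, ASSUMED_PERIOD_NS)} ns at {ASSUMED_PERIOD_NS} ns/cycle")
```

With real data, each module would use its own period from the timing reports, since the maximum frequency differs per module and per constraint.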
8/18/04
To Do:
1. Find out timing of more modules that take up 4 CLB columns. After
explaining the situation to Sudarshan, he lowered the threshold to 4 CLB
columns.
Done:
1. All done with FFT! Ran seven constrained syntheses, one of them took
over two hours to complete (!), the rest averaged about forty minutes.
It was indeed a long day. I found a discrepancy between how the
synthesis tools did a full board implementation. If the constraint
included the whole board, the synthesis placed the module differently as
opposed to if there was no constraint. Logically it shouldn't matter, in
the end the two constraints mean the same thing. But as I was running
the full board constraint, this distinction occurred to me and I had to
test it out to see if it was true. I'll just include this in my data
sheets and show it to Eli.
8/19/04
To Do:
1. Get back to my project: Next module FFT256. That should take a whole
day synthesizing.
Done:
1. Figured out computational time for half a dozen more modules, updated
data spreadsheet, it's in the files folder now.
8/20/04
To Do:
1. Sudarshan emailed me with questions about the data I gave him, have
to answer. Will require synthesizing modules several times to double
check data. Also have to double check module placement, I know I'm
right, but I want to double check. Lastly he wants input length and size
to get an accurate idea of computation latency.
Done:
1. Fired off a long email to Sudarshan, updated data spreadsheet.
8/22/04
To Do:
1. On Friday, Eli mentioned that she wanted intermediary columns for the
FFT. The data changes abruptly, so it should be helpful to see how it
changes.
2. Synthesize the rest of the modules: cordic, 2d cos transform, matrix
multiply (done, but double check all data) and optional is fft 256 and
fft 1024. The last two fft's would be purely to compare against the fft
and I'll probably do it last.
3. Have to email Eli the data and explanatory docs.
4. Update spreadsheets. Data is there, but I have yet to put it fully in
order.
Done:
1. Synthesized 2-D Discrete Cosine Transform. Next up CORDIC.
Week 9 || August 23 - August 27
Weekly Goals:
Work on any applications Sudarshan needs; get apps for my research
paper; start work on final research presentation; start work on final
research paper.
8/23/04
To Do:
1. For his project, Sudarshan wants to use a jpeg compression
application. I have two modules to look up: an RGBtoYCL converter and a
quantize app. Xilinx support pages are my friends.
8/24/04
To Do:
1. Encoding for JPEG. Synthesize and simulate.
Done:
1. There are three modules used for JPEG encoding: RGB-YCrCb, 2-D
Discrete Cosine Transform and Quantization. Done simulating and synthesizing.
Updated files folder with data, called JPEGdata.
8/25/04
To Do:
1. Start outline of paper.
2. Decoder, JPEG. Synthesize and simulate.
Done:
1. Slight problems with the downloaded code, inverse-quantize has
errors. Inverse-quantize, inverse-DCT and YCrCb-RGB have all been
synthesized, and with the one exception, simulated as well. Emailed
Sudarshan about code error, hopefully he can take a look at it as well.
Updated data spreadsheet.
Week 10 || August 30 - September 3
Weekly Goals: Last week! Finish up and practice
presentation; final paper; work on final version of website.