
Topic: Documenting my journey into FST4W using WSJT X, Mac Mini M1, and Icom 7300

WB6YRW

  • Member
  • Posts: 195

This thread picks up where I left off in the eham.net "Antennas, Towers and More" forum thread "Next Chapter: life after sky loops --> vertical antenna exploration".

As mentioned in the other thread, Rob Robinett AI6VN asked me if I was interested in allocating some of my antenna farm and long running station beacon resources to FST4W, a WSPR-like beacon transmission mode.  Rob is one of the key developers behind the kiwi SDR WSPR and FST4W mode receivers, and he is passionate about and dedicated to hosting and sustaining the core WSPR servers that drive the WSPR Rocks analytic servers.

-------

During my 6m WSPR initial testing, I elected to try WSJT X connected to my Mac Mini M1 (Big Sur) and Icom 7300.
As documented in the other thread, "Next Chapter: life after sky loops --> vertical antenna exploration", I have been running WSPR beacons nonstop for almost 3 years now, which I define as a "long running" beacon.

--------

First Challenge / Issue: WSJT Crashing on Mac Mini M1--> 250+ hours of experiments to debug/resolve --> and a solution

After successfully installing, configuring, and operationalizing WSJT X, I began running into crash issues when I attempted to operate WSJT X on the Mac Mini M1 beyond 24 hours.  The Zachtek units I am using for WSPR have been running for well over 2 years nonstop, and my intent was to do the same with WSJT on my Mac and the Icom 7300 for FST4W.  I initially began my configuration and testing on 6m WSPR using WSJT, the Mac and the Icom 7300... where I subsequently encountered the crash and began the following journey of trying to resolve this issue.

I posted a question about this crash (in this forum) and I eventually reached out to James, K0UA.  James is great, and he provided me some directional guidance, but I was on my own because so few people are performing long running WSJT X beacon transmission work.
I also posted this question, at the suggestion of James, on the WSJT groups.io forum, which seems to be the watering hole for all things related to WSJT.

---------

Situation:

Operating WSJT X v2.5.2 or v2.5.4 on a Mac Mini M1 (Big Sur) with an Icom 7300 (latest firmware; new unit as of Feb 2022), WSJT X consistently halts, regardless of which of these two versions is used, after 24 to 25 hours of nonstop operation at 50% tx duty cycle, with between 350 and 375 transmissions logged in the WSJT X main screen log.

Experiments and outcomes:

1. Experiment 1 series: Wrapped the USB cable between the Icom 7300 and the Mac Mini M1 around a type 43 ferrite donut (10 loops).  Ran 10 experiments (Experiment 1a, 1b, etc., where each of a, b, etc. is the same experiment exactly repeated) with this configuration and no variable changes.  10 experiments for 10 days nonstop.  First 5 experiments used 1 watt output.  Next 5 experiments used 50 watts. All 10 experiments stopped transmission between 24 hours to 25 hours at 50% tx duty cycle and between 350 to 375 logged transmissions in the WSJT X main screen log.  While many people have stated the ferrite donut traps the RF related issue, my testing in my environment has not led me to this conclusion, empirically.  YMMV.

2. Experiment 2 series: Same configuration and test plan as Experiment 1 with one variable change: inserted a USB connector that disconnects the pin 4 ground, which was successfully used by one person on the main@WSJTX.groups.io forum.  10 experiments for 10 days nonstop.  First 5 experiments used 1 watt output.  Next 5 experiments used 50 watts. All 10 experiments stopped transmission between 24 hours to 25 hours at 50% tx duty cycle and between 350 to 375 logged transmissions in the WSJT X main screen log.  Note: I kept the type 43 ferrite donut in this experiment. While a few people stated the pin 4 ground on the USB is what causes the problem, my testing in my environment has not led me to this conclusion, empirically.  YMMV.


For each experiment (1a, 1b, ... and 2a, 2b, ... etc.) I kept an experiment log recording how much system RAM was available as soon as WSJT launched and how much was used at crash.  I also logged how much RAM the WSJT X application was consuming as soon as WSJT was started, as well as how much it was using at point of crash.

The Mac Mini M1 has 16 gigs of system RAM.
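
For anyone who would rather log these numbers automatically than read them off by hand, here is a minimal sketch. It is not what was used in the experiments above. It assumes the WSJT X process shows up in `ps` under the name `wsjtx`, and the log file name and 10 minute interval are made up for illustration. The resident memory reported by `ps` is not necessarily the same figure a GUI memory monitor shows.

```python
import subprocess
import time

PROCESS_NAME = "wsjtx"  # assumption: how the WSJT X executable is named in `ps`


def parse_rss_mb(ps_output, process_name):
    """Sum resident memory (in MB) of all processes whose executable is process_name.

    Expects lines of "<rss in KiB> <command path>", as printed by
    `ps -ax -o rss= -o comm=`. Lines that do not start with a number are skipped.
    """
    total_kib = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        rss_kib, command = parts
        # Compare only the last path component, e.g. ".../MacOS/wsjtx" -> "wsjtx".
        if command.strip().split("/")[-1] == process_name:
            total_kib += int(rss_kib)
    return total_kib / 1024.0


def sample_rss_mb(process_name=PROCESS_NAME):
    """Take one sample by asking ps for the resident size and command of every process."""
    result = subprocess.run(
        ["ps", "-ax", "-o", "rss=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_rss_mb(result.stdout, process_name)


def log_forever(path="wsjtx_ram_log.csv", interval_s=600):
    """Append "timestamp,rss_mb" to a CSV file every interval_s seconds until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(path, "a") as log_file:
            log_file.write(f"{stamp},{sample_rss_mb():.1f}\n")
        time.sleep(interval_s)
```

Calling `log_forever()` in a terminal window and leaving it running gives a timestamped record that still exists after a hang, so the notes do not depend on being at the keyboard.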

Observations set 1:

* In none of the 20 experiments did system RAM usage go beyond 13 gigs, so at least 3 gigs of unused RAM remained available prior to the "crash"

* In none of the 20 experiments did the WSJT X RAM usage go beyond 300 megabytes at point of "crash" (this 300 meg figure was actually an outlier); on average, WSJT X was using 160 megabytes +/- 5 megabytes at point of "crash"

Observation set 2:

1. Crash Definition / Observation: 

As a software developer, when I test and report bugs, a hard crash is when the entire application sw is 100% non responsive; also known as either a P0 or Sev0. 

* I did encounter two instances where a P0/Sev0 occurred out of the 20 experiments.  I could never consistently reproduce this issue.
 
* The other 18 experiments I would call a P1/Sev1, which typically is defined where a major aspect of the sw functionality/feature is non responsive but other parts of the application are still functional.  I was consistently able to reproduce this type of P1/Sev1 issue.

Observation set 3:

I set WSJT X to run at 50% duty cycle, which means it transmits in 50% of the available WSPR transmit slots and stays off the air in the other 50%.  When set at 50%, this equates to 13 to 14 WSPR transmissions per hour.

* When I observe my lab notes, P1 crash occurs consistently between 350 to 375 tx; a relatively "tight" band of transmissions logged in the WSJT log monitor at point of crash
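
As a rough cross-check of these figures (a sketch, not part of the lab notes): WSPR uses 2 minute slots, so an hour holds 30 slots and a 50% duty cycle averages about 15 transmissions per hour. Dividing the logged 350 to 375 transmissions by the 24 to 25 hours of run time gives about 14 to roughly 15.6 per hour, which sits close to both the nominal figure and the per hour rate quoted above.

```python
SLOT_MINUTES = 2  # length of one WSPR transmit/receive slot


def nominal_tx_per_hour(duty_cycle):
    """Average transmissions per hour if duty_cycle of all slots are used to transmit."""
    slots_per_hour = 60 / SLOT_MINUTES
    return slots_per_hour * duty_cycle


def observed_tx_per_hour(tx_count, hours):
    """Transmissions per hour implied by a logged count over a run time in hours."""
    return tx_count / hours


nominal = nominal_tx_per_hour(0.5)
slowest = observed_tx_per_hour(350, 25)  # fewest transmissions over the longest run
fastest = observed_tx_per_hour(375, 24)  # most transmissions over the shortest run

print("nominal per hour:", nominal)
print("observed per hour, low end:", slowest)
print("observed per hour, high end:", fastest)
```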

Observation set 4:

When a crash occurs, I am typically at work and cannot restart WSJT without taking my lab "crash" notes.  When I get home, I write my lab notes, quit WSJT, restart WSJT, and resume transmitting.  A common theme surfaced, however: I typically observed approx 24 to 25 hrs of run time at 50% duty cycle before a crash occurs.

* Sometimes I get home from work early. Sometimes late at night.  This means that regardless of the time of day I restarted WSJT (I thought there might be a transient electrical spike or RF from my home solar panels or the EV supercharger or the fridge in the garage starting...or....) I observed 24 to 25 hours of consistent transmission time from the restart time stamp.

---------

After completing these experiments over the past several weeks, I needed to begin looking for a solution in my environment so that I can perform long running FST4W beacon experiments.

--------

Solution

After some Googling and drawing on the kind of thought work I have performed at companies over the past 30 years, I thought about a Mac Time Machine-like app (Time Machine being the Mac sw that automatically performs scheduled backups), except the Mac app would need to schedule turning on/off a 3rd party Mac application...well, it turns out there is a built in Mac application called "Automator", which is a drag and drop visual macro scripting tool that enabled me to start WSJT at a specific time, quit/exit WSJT at a specific time, and restart WSJT at a specific time.

I watched some YouTube videos and spent about 5 hours learning to use Automator with WSJT. Success!

Without going into great detail, these are the high level steps I successfully used:

1. Launch Automator
2. Select the Calendar based Automator function
3. Select the Launch app or the Quit app function (depending on whether you're starting or quitting WSJT in order to reset the memory issue / log issue I previously documented and observed)
4. Select the WSJT app
5. Select the TX Button in WSJT .... Note: you need to use the following Automator function: "Record and Play Back Mouse Activity" in order to select the TX Button in WSJT. (Discovering and learning this tool took me the longest research time; once I discovered this function, it was pretty straightforward after watching YouTube videos)
6. Run. Test. Debug. Save. Done

7. The Quit WSJT macro script is built the same way as the Start WSJT macro script (I made two separate scripts, one to start WSJT and one to quit WSJT), except instead of selecting the TX button, I selected the File/Quit function at the top of the WSJT navigation bar, which quits out of WSJT.

8. Once you have named and saved your own Start WSJT and Quit WSJT macros, you can then create a calendar event using the Apple built in calendar.  I watched a couple of YouTube videos on how to launch Automator macros from within the Apple calendar.  Very easy and straightforward.

 
Example

I start the WSJT application with "Start WSJT" at 3:00PM local time, for example, in the Apple Calendar.  I quit the WSJT application with "Quit WSJT" in the Apple Calendar approx 24 hours later, at 2:58PM local time the following day.  Two minutes later, the Apple Calendar starts WSJT again. Rinse and repeat.  I then made both calendar Automator events repeat each day in perpetuity.   Impact - Apple Calendar shows the start/stop/start WSJT events. Done.
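
The same quit, wait, relaunch cycle could in principle be scripted instead of built with calendar events. The sketch below is only an illustration and has not been run against WSJT X. It assumes the app bundle is named `wsjtx` and that it honors a standard quit request sent through `osascript`. The function names are made up. It also does not press the TX button, so the mouse recording step would still be needed after launch.

```python
import datetime as dt
import subprocess
import time

APP_NAME = "wsjtx"  # assumption: name of the installed application bundle


def quit_command(app_name=APP_NAME):
    """Command line that asks a running macOS app to quit via AppleScript."""
    return ["osascript", "-e", f'tell application "{app_name}" to quit']


def launch_command(app_name=APP_NAME):
    """Command line that launches a macOS app by name."""
    return ["open", "-a", app_name]


def next_occurrence(now, hour, minute):
    """Return the next datetime at hour:minute that is strictly after now."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += dt.timedelta(days=1)
    return candidate


def restart(app_name=APP_NAME, pause_s=120):
    """Quit the app, wait pause_s seconds, then launch it again."""
    subprocess.run(quit_command(app_name), check=True)
    time.sleep(pause_s)
    subprocess.run(launch_command(app_name), check=True)


def run_daily(hour=14, minute=58):
    """Sleep until the next hour:minute, restart the app, and repeat forever."""
    while True:
        target = next_occurrence(dt.datetime.now(), hour, minute)
        time.sleep((target - dt.datetime.now()).total_seconds())
        restart()
```

With the defaults, `run_daily()` mirrors the schedule above: quit at 2:58PM, relaunch two minutes later.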

I have run this macro experiment twice so far, and so far so good...no issues (I just finished this last night around 2AM).  I watched WSJT start by itself, quit by itself, and restart by itself.  Impact - the WSJT monitor log is cleared, RAM is cleared, and WSJT FST4W and 6m WSPR currently run nonstop, unattended, with no crashing so far.

= hopeful long running FST4W can now begin.

-------

Thought I would share these experiments, outcomes, and workarounds for WSJT users (regardless of Mac or Windows) if you're performing long running beacon transmissions.

---------

I need to perform a couple more antenna trim / SWR tunes for FST4W 160m, but as of right now I am transmitting on FST4W 160m.

More on this journey to follow.

-stu



« Last Edit: May 28, 2022, 12:26:13 AM by WB6YRW »

K0UA

  • Member
  • Posts: 9589

Please keep us informed. What is your gut feeling about the cause of the hang up? You have developed a "work around", and I am sure it will be successful, but we still don't really know the reason for the hang up.
73  James K0UA

WB6YRW

  • Member
  • Posts: 195

Hi James,

Yes, you are absolutely correct; I jumped from experiments/observations to the solution while not explaining the bridge leading me to this workaround. 

The short answer: my gut and instinct tell me this is pointing toward a WSJT software code issue or an unintended use case limitation.  Others will need to run long running, nonstop, controlled WSJT transmission experiments (WSPR, FST4W, and the like) to validate or invalidate this supposition.

Explainability

* When solving these types of complex issues/challenges, experience taught me to always go after the low hanging fruit possible solutions and increasingly and incrementally progress toward more complex experiments to solve; let the data from these experiments and outcomes drive the increasingly and incrementally more complex experiments and possible solutions

* I divided the "crash" issue into three different "areas of concern" as opposed to one big problem: (1) RF related issue, (2) Mac related issue, and (3) WSJT sw bug/issue/unintended or unforeseen use case

(1) RF related issues seemed to me the fastest and easiest experiments to test and use as a possible solution.

(a) The lowest hanging fruit experiment and outcome involved wrapping the USB cable around the toroid donut. I had a spare donut from the antenna experiments so that was a no brainer to try. Fast and easy but no successful outcome.

(b) The next lowest hanging fruit experiment involved the USB interconnector / pin 4 ground jumper suggestion. $10 for a pack of 4 from Amazon. Fast and easy but no successful outcome.

At this juncture, I began to suspect this "crash" issue is not RF induced...in my situation / configuration.  Searching the Internet for other RF related topics on WSJT crash issues surfaced no new possible RF induced solutions (that I could find).  I paused at this juncture: while there remains a possibility of an RF induced issue in the Venn diagram of possibilities, this RF induced supposition, in my operating environment, is now materially downgraded as a source crash culprit.

(2) I turned to the next relatively fast and easy experiments: searching, reading and exploring the most stable Mac OS versions.  At the time of this research, Mac OS Monterey was being released; as a developer, I always stay intentionally a little behind when updating OSs for obvious reasons. I opted to stay on Mac OS Big Sur; most Mac M1 users on the Internet who had been using WSJT stated they were successful using WSJT with Big Sur; no feedback about Monterey...hence I stayed on Big Sur.  Again, it is possible there is a Big Sur issue in the Venn diagram of possible crash culprits, but this seems minimal based on my research.

At this juncture, 2 out of 3 areas of concern have been reasonably covered and minimized...leading me to WSJT software

(3) WSJT software - I performed a three phased, divide and conquer approach to these experiments, tests, and outcomes
(a) Identify the most stable version of WSJT for Mac and use that version
(b) Begin the battery of experiments listed in the first post above
(c) Explore memory related issues - the most difficult to trace when not looking at code; and even when looking at code ...very time consuming.

I had read a few posts (ungrounded) that WSJT v2.5.2 might be a little more stable than v2.5.4 for the Mac.  I originally installed v2.5.4 and ran the aforementioned tests for the first 2 areas of concern (RF and Mac OS) on v2.5.4, all of which crashed.

I subsequently uninstalled v2.5.4 and installed the earlier version, v2.5.2.  The end result was the same...a similar crash after 24 to 25 hours of nonstop transmission.

At this juncture all the fast and easy experiments to resolve the crash have been reasonably conducted.

I subsequently began the battery of experiments and tests posted in the very first post of this thread.

After running these experiments and looking at the results, there are three possible WSJT sw issues that could be the cause:
1. A memory leak within WSJT, which I cannot easily resolve
2. A memory management issue, which I cannot easily resolve
3. An unintended use case issue: Perhaps performing long running beacon transmissions was not considered in the product use cases --> and there is a "ceiling" variable that unintentionally prevents nonstop transmissions.
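
One way possibility 1 could be tested against logged data, sketched here under assumptions: if WSJT memory is sampled periodically during a run, a steady upward slope would point toward a leak, while a flat line would point away from one. The function below is a plain least-squares slope. The input format, pairs of (hours since start, megabytes used), is made up for this illustration.

```python
def slope_mb_per_hour(samples):
    """Least-squares slope of memory use (MB) against elapsed time (hours).

    samples is a list of (hours_since_start, memory_mb) pairs; at least two
    samples taken at different times are required.
    """
    count = len(samples)
    if count < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / count
    mean_m = sum(m for _, m in samples) / count
    spread_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread_t == 0:
        raise ValueError("samples must span more than one point in time")
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return covariance / spread_t
```

A slope near zero over a full 24 hour run would not rule out the other two possibilities, but it would make a simple, steadily growing leak less likely.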

Again, this is not conclusive, but I am leaning more toward a sw related issue.  Based on that intuition and seeing that I could perform a quit and restart on WSJT...that led me to finding / using the Automator.

Over the course of the next couple of weeks we will see if the Automator is the workaround; if it is, that becomes another, more rigid data point on whether RF and Mac OS are or are not the culprits.
 



N6YWU

  • Posts: 362
    • HomeURL

You can use the Activity Monitor app (in Applications : Utilities) for monitoring wsjtx memory usage over time.

For long running FT8 beacons, instead of running the full WSJT-X GUI, which has a bunch of stuff un-needed for beacons, I run ft8code, which is hidden inside the macOS wsjtx.app bundle (and also inside the Raspberry Pi distribution), to encode the message.  And then run a basic program or python script to create the FSK waveforms and ship them out to audio at the proper time. 

Since this does not require a lot of computer performance, I don't run it on my M1 MacBook, but on a headless Raspberry Pi, which I have colocated with the transmitter (metal cases grounded together).  So very little opportunity for RFI to get into long USB cables, HDMI cables, etc.
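
A minimal sketch of the "create the FSK waveforms" step, not the actual script described above. It takes a list of tone numbers (0 to 7) and produces continuous-phase FSK audio using the FT8 symbol length of 0.16 s and tone spacing of 6.25 Hz. The 12000 Hz sample rate and 1500 Hz base frequency are assumptions. Parsing the ft8code output, the Gaussian smoothing of frequency transitions that current WSJT-X applies, and the timing of playback are all left out. The 7 tone sync pattern is used only as sample input; a full message is 79 symbols.

```python
import math
import struct
import wave

SAMPLE_RATE = 12000     # Hz; assumption, any rate the audio device accepts would do
SYMBOL_SECONDS = 0.16   # FT8 symbol length
TONE_SPACING_HZ = 6.25  # FT8 tone spacing
BASE_FREQ_HZ = 1500.0   # assumption: audio frequency used for tone 0


def fsk_samples(tones, base_hz=BASE_FREQ_HZ, amplitude=0.5):
    """Return audio samples for a tone sequence, carrying phase across symbols.

    Keeping the phase continuous at symbol boundaries avoids the clicks that
    restarting each tone at phase zero would cause.
    """
    per_symbol = round(SAMPLE_RATE * SYMBOL_SECONDS)
    phase = 0.0
    samples = []
    for tone in tones:
        freq_hz = base_hz + tone * TONE_SPACING_HZ
        step = 2.0 * math.pi * freq_hz / SAMPLE_RATE
        for _ in range(per_symbol):
            samples.append(amplitude * math.sin(phase))
            phase = (phase + step) % (2.0 * math.pi)
    return samples


def write_wav(path, samples):
    """Write samples (floats in -1.0 to 1.0) as a mono 16-bit PCM WAV file."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(frames)


COSTAS_SYNC = [3, 1, 4, 0, 6, 5, 2]  # FT8 sync tone pattern, sample input only
```

For example, `write_wav("sync.wav", fsk_samples(COSTAS_SYNC))` writes a file of about 1.1 seconds that can be inspected in any audio editor.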

WB6YRW

  • Member
  • Posts: 195

Thank you for the tips Ron.  I went to your site; very pretty cat, btw.  Sorry for your loss.

All of the experiments used the Activity Monitor.

I am intrigued by the possibility of an FST4W pi; very similar to the headless Zachteks; I am running 4 Zachteks for WSPR. Pure joy in simplicity and operation.

Do you know if anyone built/documented their FST4W WSJT tx pi builds?

Thanks.

-stu

